Journal of Applied Psychology


John G. Darley, Editor
University of Minnesota




Consulting Editors 


Harold E. Burtt, Ohio State University
Alphonse Chapanis, Johns Hopkins University
Clifford E. Jurgensen, Minneapolis Gas Company
Laurence S. McGaughran, University of Houston
Quinn McNemar, Stanford University
Alexander Mintz, City College of New York
Harold F. Rothe, Fairbanks, Morse and Company
Julian B. Rotter, Ohio State University
Thomas A. Ryan, Cornell University
Donald E. Super, Columbia University
Miles A. Tinker, University of Minnesota
Alfred C. Welch, University of New Mexico

Arthur C. Hoffman, Managing Editor
Helen Orr, Promotion Manager
Sarah Womack, Editorial Assistant




Published bimonthly by the American Psychological Association, Inc. 


Prince and Lemon Sts., Lancaster, Pa., and 1333 16th St. N.W.,
Washington 6, D. C. 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879 


Acceptance for mailing at the special rate of postage provided for in paragraph (d-2), Section 34.40, P. L. & R. of 1948, authorized October 10, 1947


Copyright © 1958 by the American Psychological Association, Inc. 




LANCASTER PRESS, INC., LANCASTER, PA.


Contents of Volume 42 


Aborn, M. See Rubenstein, H.
Alluisi, E. A., and Martin, H. B. An Information Analysis of Verbal and Motor Responses to Symbolic and Conventional Arabic Numerals .... 79
Anderson, J. K. See Miner, J. B.
Ashcroft, S. See Meyers, E.
Astin, A. W. Dimensions of Work Satisfaction in the Occupational Choices of College Freshmen .... 187
Bamford, H. E., Jr., and Ritchie, M. L. Complex Feedback Displays in a Man-Machine System .... 141
Barch, A. M. Judgments of Speed on the Open Highway .... 362
Bass, B. M. See Pennington, D. F., Jr.
Beldo, L. A. See Longstaff, H. P.
Bendig, A. W., and Stillman, Eugenia L. Dimensions of Job Incentives Among College Students .... 367
Benson, P. H., and Peryam, D. R. Preference for Foods in Relation to Cost .... 171
Bolda, R. A. See Lawshe, C. H.
Borg, W. R., and Tupes, E. C. Personality Characteristics Related to Leadership Behavior in Two Types of Small Group Situational Problems .... 252
Borislow, B. The Edwards Personal Preference Schedule (EPPS) and Fakability .... 22
Brown, R. L. Wrapper Influence on the Perception of Freshness in Bread .... 251
Brune, R. L. See Lawshe, C. H.
Buffa, E. S., and Lyman, J. The Additivity of the Times for Human Motor Response Elements in a Simulated Industrial Assembly Task .... 379
Campbell, D. T., Hunt, W. A., and Lewis, Nan A. The Relative Susceptibility of Two Rating Scales to Disturbances Resulting from Shifts in Stimulus Context .... 213
Chalmers, W. E. See Stagner, R.
Chambers, B. See Stockbridge, H. C. W.
Clarke, W. V. See Merenda, P. F.
Comrey, A. L. A Factor Analysis of Variables Related to Driver Training .... 218
Conklin, J. E., and Lindquist, O. H. Recovery From Unusual Aircraft Attitudes Under the Influence of Vertigo .... 136
Conrad, R. Accuracy of Recall Using Keyset and Telephone Dial, and the Effect of a Prefix Digit .... 285
Crannell, C. W. See Debons, A.
Creager, J. A., and Harding, F. D., Jr. A Hierarchical Factor Analysis of Foreman Behavior .... 1
Debons, A., and Crannell, C. W. The Legibility of “Scotchlite” Versus Other Materials .... 389
Decker, R. L. A Study of the Value of the Owens-Bennett Mechanical Comprehension Test (Form CC) as a Measure of the Qualities Contributing to Successful Performance as a Supervisor of Technical Operations in an Industrial Organization .... 50
DeGidio, JoAnne. See Dunnette, M. D.
Derber, M. See Stagner, R.
DuBois, P. H. See Manning, W. H.
du Mas, F. M., and MacBride, K. A Manifest Structure Analysis of the Otis S-A Test of Mental Ability, Higher Examination: Form 1 .... 269
Dunnette, M. D., Kirchner, W. K., and DeGidio, JoAnne. Relations Among Scores on Edwards Personal Preference Schedule, California Psychological Inventory, and Strong Vocational Interest Blank for an Industrial Sample .... 178
England, G. W., and Paterson, D. G. Relationship Between Measured Interest Patterns and Satisfactory Vocational Adjustment for Air Force Officers in the Comptroller and Personnel Fields .... 85
Ethington, Doris. See Meyers, E.
Eysenck, H. J. A Short Questionnaire for the Measurement of Two Dimensions of Personality .... 14


Fine, B. J. The Comparative Effectiveness of Some Psychological and Physiological Measures in Ranking the Impact of Diverse Environmental Conditions
Fine, B. J., and Haggard, D. F. Contextual Effects in Scaling .... 94
Flanagan, J. C. See Glaser, R.
Fleishman, E. A., Roberts, M. M., and Friedman, M. P. A Factor Analysis of Aptitude and Proficiency Measures in Radiotelegraphy .... 129
Fleishman, E. A. See Highland, R. W.
Friedman, M. P. See Fleishman, E. A.
Garman, G. D., and Uhr, L. An Anxiety Scale for the Strong Vocational Interest Inventory: Development, Cross-Validation, and Subsequent Tests of Validity .... 241
Gebhart, G. G., and Hoyt, D. P. Personality Needs of Under- and Overachieving Freshmen
Glaser, R., Schwarz, P. A., and Flanagan, J. C. The Contribution of Interview and Situational ...
Glickman, A. S., and Vallance, T. R. Curriculum Assessment with Critical Incidents .... 329
Goi, F. J. See Wells, W. D.
Greek, D. C., and Small, A. M., Jr. ...
Greenberg, H., and Hutto, Dolores. The Attitudes of West Texas College Students ... 301
Griew, S., and Tucker, W. A. The Identification of ... Differences in the Engineering ... 278
Groth, Hilde, and Lyman, J. Adequacy of the Residual Sensory Cues for Psychomotor Performance of Arm ... 323
Groth, Hilde, and Lyman, J. Effects of Surface Friction on Skilled Performance with ... 273
Groth, Hilde. See Lyman, J.
Haggard, D. F. See Fine, B. J.
Haravey, F. See Pennington, D. F., Jr.
Harding, F. D., Jr. See Creager, J. A.
Harris, D. H. The Effect of Display Width in Merchandising Soap .... 285
Hay, E. N. A Simple Method o...
Heath, R. W. A Ma... with a Large Sample
Highland, R. W., and Fleishman, E. A. ... Receiving Morse Code
Hills, J. R. Controlled A...
Holland, J. L. A Note o...
Hughes, J. L., and McNamara, W. J. Limitations on the Use of Strong Sales Keys for Selection and Counseling .... 93
Humphries, M. Performance as a Function of Control-Display Relations, Positions of the Operator, and Locations of the Control .... 311
Hunt, W. A. See Campbell, D. T.
Hutto, Dolores. See Greenberg, H.
Jeantheau, B. See Loeb, M.
Jenkins, W. L. The Superi... Operation of Small Control Knobs .... 97
Jenson, P. G. Relationshi... Between Stated and Measured Interests of Two Groups of United States Air ... 33
Judy, C. J. Field Training ... School Training for Mechanics Maintaining a New Weapon System
Kamenetzky, J. See Schutz, H. G.
Kausler, D. H., and Trapp, E. P. Anxiety Level and S... Biographical Inventory .... 305
Kay, B. R. Intra-Individual Differences in Sensory Channel Pr... 166




Kennedy, J. E. A General Device Versus More Specific Devices for Selecting Car Salesmen .... 206
Kennedy, J. E., and O'Neill, H. E. Job Content and Workers’ Opinions .... 372
Kenyon, G. Y., and Pronko, N. H. Identification of Cola Beverages .... 419
Kerr, W. A. See Yeslin, A. R.
King, L. A. Factors Associated with Vocational Interest Profile Stability .... 261
Kirchner, W. K. See Dunnette, M. D.
Krug, R. E. The Effect of Specific Selection Sets on a Forced-Choice Self-Description Inventory
Krug, R. E. A Selection Set Preference Index
Lana, R. See McGinnies, E.
Lawshe, C. H., Bolda, R. A., and Brune, R. L. Studies in Management Training Evaluation: I. Scaling Responses to Human Relations Training Cases .... 396
Lawshe, C. H., and Patinka, P. J. An Empirical Comparison of Two Methods of Test Selection and Weighting .... 210
Lewis, Nan A. See Campbell, D. T.
Lindquist, O. H. See Conklin, J. E.
Loeb, M., and Jeantheau, B. The Influence of Noxious Environmental Stimuli on Vigilance
Longstaff, H. P., and Beldo, L. A. ... Alternate Forms are Used .... 109
Lyman, J., and Groth, Hilde. ... for Bare and Gloved Hands .... 18
Lyman, J. See Buffa, E. S.
Lyman, J. See Groth, Hilde.
MacBride, K. See du Mas, F. M.
McGinnies, E., Lana, R., and Smith, C. The Effects of Sound Films on Opinions About Mental Illness in Community Discussion Groups .... 40
McNamara, W. J. See Hughes, J. L.
Maher, H. See Morrison, R. F.
Mahoney, T. A. Weighted Application Blank Analysis of “Contingency” Items
Mann, J. H. Self-Ratings and the EPPS
Manning, W. H., and DuBois, P. H. Gain in Proficiency as a Criterion in Test Validation .... 191
Marke, S. See Smith, G.
Martin, H. B. See Alluisi, E. A.
Merenda, P. F., and Clarke, W. V. AVA as a Predictor of Occupational Hierarchy .... 289
Meyers, E., Ethington, Doris, and Ashcroft, S. Readability of Braille as a Function of ... Variables
Miner, J. B., and Anderson, J. K. The Postwar Occupational Adjustment of Emotionally Disturbed Soldiers .... 317
Morrison, R. F., and Maher, H. Matching Indices for Use in Forced-Choice Scale Construction
Myers, J. H. An Experimental Investigation of “Point” Job Evaluation Systems .... 357
Nickels, J. B., and Renzaglia, G. A. Some Additional Data on the Relationships Between Expressed and Measured Values .... 99
Nye, C. T. See Rothe, H. F.
O'Neill, H. E. See Kennedy, J. E.
Paterson, D. G. See England, G. W.
Patinka, P. J. See Lawshe, C. H.
Pennington, D. F., Jr., Haravey, F., and Bass, B. M. Some Effects of Decision and Discussion on Coalescence, Change, and Effectiveness .... 404
Peryam, D. R. See Benson, P. H.
Porter, L. W. Differential Self-Perceptions of Management Personnel and Line Workers .... 105
Pronko, N. H. See Kenyon, G. Y.
Rambo, W. W. The Construction and Analysis of a Leadership Behavior Rating Form .... 409
Renzaglia, G. A. See Nickels, J. B.




Ritchie, M. L. See Bamford, H. E.
Roberts, M. M. See Fleishman, E. A.
Rothe, H. F., and Nye, C. T. Output Rates Among Coil Winders .... 1
Rubenstein, H., and Aborn, M. Learning, Prediction, and Readability .... 2
Schutz, H. G., and Kamenetzky, J. Response Set in Measurement of Food Preference .... 175
Schwarz, P. A. See Glaser, R.
Seader, S. See Wells, W. D.
Seibert, W. F. A Study of the Purdue Non-Language Adaptability Test
Sheldon, M. S. See Sorenson, A. G.
Small, A. M., Jr. See Greek, D. C.
Smith, C. See McGinnies, E.
Smith, G., and Marke, S. The Influence on the Results of a Conventional Personality Inventory by Changes in the Test Situation: A Study of the Humm-Wadsworth Temperament Scale .... 224
Smith, G., and Marke, S. The Internal Consistency of the Humm-Wadsworth Temperament Scale .... 234
Soar, R. S. Numeral Form as a Variable in Numeral Visibility .... 158
Solem, A. R. An Evaluation of Two Attitudinal Approaches to Delegation .... 36
Sorenson, A. G., and Sheldon, M. S. A Further Note on the Fakability of the MTAI .... 14
Spector, A. J. Changes in Human Relations Attitudes .... 154
Stagner, R., Chalmers, W. E., and Derber, M. Guttman-Type Scales for Union and Management Attitudes Toward Each Other .... 293
Stillman, Eugenia L. See Bendig, A. W.
Stockbridge, H. C. W., and Chambers, B. Aiming, Transfer of Training, and Knowledge ... 8
Stoltz, R. E. Development of a Criterion of Research Productivity .... 308
Teichner, W. H. Reaction Time in the ... 5
Tinker, M. A. Length of Work Periods in Visual Research .... 348
Torrance, E. P. Sensitization Versus Adaptation in Preparation for Emergencies: Prior Experience with an Emergency Ration and its Acceptability in a Simulated Survival ... 63
Trapp, E. P. See Kausler, D. H.
Tucker, W. A. See Griew, S.
Tupes, E. C. See Borg, W. R.
Uhr, L. See Garman, G. D.
Vallance, T. R. See Glickman, A. S.
Vernon, L. N. See Yeslin, A. R.
Walker, K. F. A Study of Occupational ... 122
Weaver, C. H. The Quantification of the Frame of Reference in Labor-Management Communication .... 1
Wells, W. D., Goi, F. J., and Seader, S. A Change in a Product Image
... W. Interdependence of Successive Absolute Judgments .... 416
Yeslin, A. R., Vernon, L. N., and Kerr, W. A. The Significance of Time Spent in Answering ... 268
Ziller, R. C. Communication Restraints, Group Flexibility, and Group Confidence .... 346


Journal of Applied Psychology

Vol. 42, No. 1

FEBRUARY, 1958


The Quantification of the Frame of Reference in Labor-Management Communication ¹


Carl H. Weaver ²


Ohio State University 


One of the barriers to certain kinds of com- 
munication between management and labor is 
the effect which the frame of reference has 
upon the concept evoked in members of one 
of these groups by a symbol used by a mem- 
ber of the other group. This research was an 
attempt to quantify the barrier posed by dif- 
ferences between the frames of reference of 
these two groups in the area of industrial 
relations by means of the semantic differential 
technique. 


Problem 


The problem and the theory of the research were described in a preliminary report (14). Briefly, the problem concerned that type of communication in which management attempts to persuade labor to accept management’s point of view and thus effect a change in the behavior and attitude of the labor group. Accurate measurement of the frames of reference of these two groups might aid in appraising the common tendency of management to resort to another medium of communication when current methods fail.

In terms of concepts and meaning, the frame of reference (the “apperceptive mass” of educators) appears to be about what semanticists mean when they speak of a listener’s previous experience. A symbol may by explicit or written agreement stand for anything which may be agreed upon. In some scientific disciplines this agreement may be fairly precise. This is not the case, however, with most of the symbols used in communication. Concepts developed in the way demonstrated by Fisher (4) and Hull (6) are personal and individual. They are built through the process of generalizing abstractions from individual experiences with the object or process conceived. When a communicator uses a symbol to convey to a communicatee a meaning which he has in his own mind, he can only evoke in the mind of his listener the concept which has been developed there through the listener’s own past experiences with objects and processes which he has considered, consciously or not, to be related to that symbol.

¹ Based on a dissertation directed by Franklin H. Knower.
² Now at Central Michigan College.

The semantic distance between the concept 
evoked in the communicatee and the concept 
intended by the communicator is a semantic 
barrier to communication. The concepts ac- 
cumulated by a person give him a frame of 
reference through which he observes and 
evaluates the objects and processes of the 
external world. The frame of reference in- 
fluences the concepts in their formation and 
change. One of the determinants of the for- 
mation of concepts is the group norm; group 
members who have internalized well the 
norms of a reference group are likely to hold 
similar concepts and similar frames of refer- 
ence toward objects and processes related to 
the norms of that group. In the research re- 
ported here, the frame of reference was es- 
tablished by measuring the connotative mean- 
ing of selected symbols in the area of labor- 
management relations. It was believed that 
the labor and management groups had social 
norms in this area which were different from 
each other; thus, members of the two groups 
should respond to the test in different ways. 




Method 


The semantic differential, developed by Osgood 
and others, was used to measure the meaning of 
concepts selected from the area of industrial rela- 
tions. The theory and technique of the semantic 
differential have been adequately described elsewhere 
(7, 8, 9, 12).
The meaning of a symbol (called a “concept”) is
measured by asking S to mark on a seven-point 
scale between two logically or psychologically op- 
posing terms the point at which he perceives the 
meaning to lie. The two extremes are called the 
“gradient.” This is an example: 


SENIORITY    hot  1  2  3  4  5  6  7  cold


In the research reported here, the S drew a circle around the number which he believed best represented the meaning of the concept (seniority) on the gradient paired with it (hot-cold).

The concepts for this study were chosen after interviews with state and regional labor leaders and management executives, and after surveying the writings of such authors as Bakke (1), Reynolds (11), Chamberlain (2), Peters (10), Heron (5), Walker (13), and others. Since the purpose of this research was to measure a barrier to communication, only those areas in which the positions of the labor


group and the management group could be expected 
to diverge were listed. These can be seen in Table 1. 
It was believed that the 21 areas listed there in- 
cluded most of the important diverging social norms 
of the two groups, and that they could be subsumed 
under two broad categories: reducing the discretion 
of management and the struggle for the loyalty of 
the worker. Ten symbols were selected and listed 
in the right-hand column of the table which might 
be expected to evoke concepts related to one or 
more of these areas. 

The gradients which were matched against these concepts were taken from a factor analysis reported by Osgood (8). They are listed in Table 2. Since this research was evaluative in character, only those gradients which had high loadings on the evaluative factor were selected. This list would not have been greatly changed had it been selected from the second factor analysis done by Osgood and Suci (9). The figures in parentheses after the gradients are the factor loadings. The last gradient in the list was not used in the factor analysis but was used by Osgood in other studies.

Since each concept was paired with each gradient, the pilot test consisted of 300 items, arranged in random order. The sheets were rotated and stapled together with a cover page of instructions and a final information sheet. This pilot test was administered to 25 Ohio State University students who strongly favored and 25 who strongly opposed labor unions. The average age of the pro-labor group was 24.33 years and of the anti-labor group, 22.24 years. Twenty-four of the pro-labor group and 21 of the anti-labor group were males. At the time of the test three of the pro-labor group were union members but none of the anti-labor group. The pro-labor group recorded a total previous union membership of 35 years and two months. The anti-labor group recorded a total previous membership of six years and six months.

The critical ratio of proportion technique was used to select the differentiating items. Of the 300 items, [...] differentiated between the groups at the 5% level of confidence. No gradient differentiated when paired with either of the concepts equal pay and work quota. The number of neutral scores given these concepts and the lack of consistency suggested that they were measuring concepts other than the ones intended. The shortening of equal pay from equal pay for equal work may have changed the concept. These two concepts were dropped from the test.


Table 1
Derivation of the Concepts

Areas of Opposing Norms

A. Reducing the discretion of management
   1. Wages and profits
   2. Transfers
   3. Promotions
   4. Lay-offs
   5. Hours
   6. Settlement of grievances
   7. Hiring (closed or union shop, former employees)
   8. Discipline
   9. Discharge
   10. Arbitration
   11. Rate of production (pace, speed-up)
   12. Equal pay (elimination of competition)
   13. Job classification
B. Struggle for the loyalty of the worker
   14. Enforced union membership
   15. Independent vs. international union
   16. Industry-wide bargaining
   17. Support of other unions
   18. Working during a strike
   19. Attending meetings
   20. Collective vs. individual bargaining
   21. Voting labor

Concepts

Seniority
Grievance
Arbitration
Work quota
Equal pay (for equal work)
The closed shop
The labor movement
Working during a strike
Individual bargaining
Labor in politics


Table 2
Evaluative Gradients Taken from Osgood and Suci’s Factor Analysis

1. good-bad (.88)
2. beautiful-ugly (.86)
3. sweet-sour (.83)
4. clean-dirty (.82)
5. kind-cruel (.82)
6. pleasant-unpleasant (.82)
7. bitter-sweet (.80)
8. sacred-profane (.81)
9. nice-awful (.87)
10. fragrant-foul (.84)
11. honest-dishonest (.85)
12. fair-unfair (.83)
13. tasty-distasteful (.77)
14. valuable-worthless (.79)
15. happy-sad (.76)
16. ferocious-peaceful (.69)
17. bright-dark (.69)
18. healthy-sick (.69)
19. fresh-stale (.68)
20. brave-cowardly (.66)
21. black-white (.64)
22. calm-agitated (.61)
23. rich-poor (.60)
24. clear-hazy (.59)
25. high-low (.59)
26. empty-full (.57)
27. relaxed-tense (.55)
28. rough-smooth (.46)
29. near-far (.41)
30. up-down

Note. Figures in parentheses after the gradients are the loadings on the evaluative factor in Osgood's factor analysis.


The Pretest


The 12 gradients which differentiated between the two criterion groups with each of the remaining eight concepts at the highest levels of confidence were retained and combined into a new test. In addition, the gradient good-bad was paired with four concepts at the end of the test to make a test of 100 items. These items were combined randomly, except that no gradient was allowed to follow itself directly and a concept was separated from itself by other concepts. Since six of the concepts were stated from the labor point of view and two from the management point of view, the direction of the continuum was not regular. These items were reversed before tallying the responses.
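
The composition of the pretest and the reversal step can be sketched as follows. This is a minimal illustration, not the scoring procedure actually used: the function names are hypothetical, and it is assumed that "reversing" an item means mirroring the response about the midpoint of the seven-point scale, and that the set of items to reverse is supplied by the scorer.

```python
# Illustrative sketch only; names and the mirroring rule are assumptions.
SCALE_MIN, SCALE_MAX = 1, 7


def pretest_item_count(n_concepts=8, gradients_per_concept=12, extra_good_bad=4):
    """Retained concept-gradient pairs plus the added good-bad items."""
    return n_concepts * gradients_per_concept + extra_good_bad


def reverse_score(response):
    """Mirror a response about the scale midpoint (1 <-> 7, 2 <-> 6, 4 stays 4)."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError("response must lie on the seven-point scale")
    return SCALE_MIN + SCALE_MAX - response


def align(responses, reversed_items):
    """Return a copy of {item_number: response} with the flagged items reversed."""
    return {
        item: reverse_score(value) if item in reversed_items else value
        for item, value in responses.items()
    }
```

With eight concepts, 12 gradients each, and four added good-bad items, `pretest_item_count()` gives the 100 items reported above.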

The pretest was administered to two labor and 
two management groups which were considered to 
be criterion groups. The labor groups were 20 local- 
union officers assembled at The Ohio State Univer- 
sity at a labor institute, and 48 members of the 
state council of an international union. All of these 
Ss were elected officers. It was hypothesized that 
one of the reasons they had been elected to their 
offices was that they had internalized well the norms 
of their groups. The management groups were 33 
members of a local unit of an international service 
club, all of whom had expressed prejudice toward 
the management point of view, and 38 members of 
an industrial association in a large midwestern city. 

The test was administered to the first three of 


Fig. 1. Profiles of two labor groups and two management groups on the 100-item pretest. (Groups: A, State Labor Council; B, Local-Union Officers; C, Service Club; D, Industrial Association. Concepts plotted: The Closed Shop, Grievance, Arbitration, The Labor Movement, Working During a Strike, Labor in Politics, Seniority, Individual Bargaining.)


these groups in group situations. The members of 
the industrial association received the test by mail 
from their executive secretary. The profiles of the 
four groups are shown in Fig. 1. 


Reliability of the Pretest 


Split-half reliability coefficients, corrected for length 
by the Spearman-Brown formula, were computed for 
labor (r= .96) and for management (r=.96). In 
addition, the product-moment correlation coefficient 
of the two labor groups (r=.93) was computed, 
and that of the two management groups (r = .85). 
This was the equivalent-group method of determin- 
ing reliability used and reported by Osgood and 
Stagner (12). The standard error of measurement 
for labor was .12 and for management .19 scale unit. 
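
The two reliability statistics named above have standard textbook forms, sketched here for reference. The function names are illustrative, and the standard deviations that entered the reported standard errors are not given in the text, so the value used in the usage note is an assumption.

```python
import math


def spearman_brown(r_half, length_factor=2.0):
    """Step a part-test correlation up to the reliability of a test
    `length_factor` times as long (2.0 for a split-half correction)."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def standard_error_of_measurement(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)
```

Under these formulas a reliability of .96 makes the standard error one-fifth of the score standard deviation, so a hypothetical standard deviation of .6 scale unit would give a standard error of .12.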

The consistency of operation of the test may be 


observed by inspecting the statistics in Table 3. 
These are the differences between the mean responses 
of the labor and management groups for each item. 
The items are grouped under the heading of the 
concepts with which they are paired. The consist- 
ency with which the differences for each concept 
hover within a narrow range may be seen on this 
table. For example, except for an equivocal item 
(No. 42) the differences between labor and manage- 
ment on all gradients paired with the concept the 
closed shop range from 3.2 scale units to 3.9 scale 
units, a range of only .7 scale unit. Two other
equivocal items may be seen on Table 3: one 
matched with labor in politics and one matched 
with working during a strike. The first two of 
these were marked almost randomly and showed no 
significant difference between labor and management. 


Table 3
Differences Between Labor and Management Mean Responses on Individual Items in Scale Units on the 100-Item Pretest


The Closed Shop 


Working During a Strike 


9. 3.9 58. 3.2 12. 3.7 41. 3.6 
13. 3.5 64. 3.7 19. 3.3 47. 3.1 
16. 3.5 74. 3.3 23. 3.0 57. 24 
a2. 3 77. 3.6 29. 1.7 66. 3.2 
49. 3.8 82. 3.3 32. 3.1 69. 3.7 
54. 3.8 95. 3.3 35. 4.0 94. 3.4 
99. 3.8 
Grievance Labor in Politics 
i, 25 56. 14 3. 3.8 61. 3.9 
foe 59. 2.1 1%; 3 75. 3.7 
i 233 73. 24 22. 3.5 78. 3.1 
39. 24 83. 2.4 26. 3.2 85. 3.3 
50. 2.3 88. 2.1 36. 3.4 89. 3.5 
53. 2.0 91. 2.0 46. 3.1 92. 2.9 
98. 2.4 
Arbitration iori 
e Seniority 
it 1.5 63. 1.3 20. 1.5 51. 14 
» E 7s. 214 25. 1.3 62. 1.5 
5: i 79. 14 30: 15 70. 14 
4 mH 84. 1.6 33. 1.6 76. 1.3 
E 87. 13 38. 1.5 93. 1.1 
6. 11 90. 2.0 44. 14 100. 15 
h 
s Sre pabor ne, Individual Bargaining 
» 2 . 3. hy 43 
: 24. 2.6 7. T a io 
i ae 34. 2.3 28. 1.7 65. 25 
< 2s 67. 3. 31, 22 68. 20 
14. 29 72. 3.0 37. 2.0 80. 18 
18. 2.9 81. 2.3 40. 1.9 86. 1. 
97. 2.6 ie aia 




Validity of the Pretest 


Osgood has discussed the validity of the semantic differential at some length (8). Perhaps the evidence for validity in this research lies in the statistics which underlie Fig. 1. The position of each group for each concept on this figure is the mean of the mean responses for all items matched with that concept. It may be noted that on every concept labor marked toward one end of the scale and management marked away from labor toward the other end. The two groups maintained this direction even on the concepts seniority and arbitration, where management was consistently on labor’s end of the scale, and on grievance, where management was approximately neutral.

It might be expected that the members of the industrial association would have internalized the norms of management more completely than the members of the service club would. About 55% of the service club were engaged in business, but only [...] were engaged in enterprises large enough that the opposing norms of these two groups would become prominent in their experience. The remainder of the 55% were engaged in small, unorganized enterprises such as jewelry stores, drug stores, a small children’s clothing store, etc. The remaining 45% of the club was composed of professional men: doctors, lawyers, judges, teachers, dentists, etc. On the other hand, the industrial association was composed of men who were rather closely engaged in this labor-management problem. About two-thirds of them [...] were industrial relations managers in name [...]. One would expect them to have internalized better the norms of the management group and to mark the items on the more extreme positions. It may be seen in Fig. 1 that, although most of the differences are not significant, the industrial association was more extreme on every concept, and the lines never cross.

About the same judgment may be made of the two labor groups. One would expect the state labor council to have internalized better the norms of the labor group, since adherence to group norms contributes to the popularity and leadership qualities of a group member. The positions of the local-union officers are considerably below those of the state labor council in the union hierarchy. Inspection of Fig. 1 shows that the two groups marked the items as expected. The state labor council was more extreme on every concept, and the lines do not cross at any point.

Point 


Results 


The mean of the item means for the 67 labor Ss was 2.2 and for the 71 management Ss 4.6. The significance of the differences between the means of these two groups on individual items (Table 3) was computed by means of the t test. Lack of homogeneity of variance among the item distributions was demonstrated by significant values for F. Consequently, a formula for t which does not make the assumption of a normal population (and homogeneity of variance) was used to test the significance of the differences between the two groups (3). Ninety-seven of the 100 items gave values for t which were statistically significant at the .1% level of confidence. The smallest value for t among these 97 items was 4.12.
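
The exact formula taken from reference (3) is not reproduced in the text. As a stand-in, the following sketch shows Welch's unequal-variance t, one familiar statistic that does not pool the two group variances. It should not be read as the computation actually performed; the function name and the sample data in the usage note are illustrative.

```python
import math
from statistics import mean, variance


def unequal_variance_t(group_a, group_b):
    """Welch's t and its approximate degrees of freedom.

    Each group's sampling variance is estimated separately, so the two
    groups need not share a common variance.
    """
    var_a = variance(group_a) / len(group_a)  # squared standard error, group A
    var_b = variance(group_b) / len(group_b)  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t, df
```

For example, `unequal_variance_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` compares two small made-up samples whose variances differ by a factor of four.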

Inspection of Fig. 1 will show the relatively 
greater polarization of the labor group as op- 
posed to that of the management group. The 
approximately neutral position of manage- 
ment (4.6) seemed to have been caused by 
the strong trend toward the labor end of the 
continuum on seniority, by the milder trend 
in the same direction on arbitration, and by 
the neutral position on grievance. On the 
other five concepts, management assumed a 
position on the continuum opposite the end 
which labor chose, but used the extreme po- 
sition on the scale less often than the mem- 
bers of the labor group did. 

The items were evaluated in terms of Os- 
good’s D value. As used by Osgood (7), the 
D value was the scale distance between the 
neutral point on the scale and the mean re- 
sponse of the group on an item. Osgood con- 
sidered an item to have a satisfactory D value 
only if the distance were 1.5 scale units; in 
the present comparison of two opposing 
groups, the D value would have to be 1.5 
in both directions. Computation of the D 
values showed that a 20-item test could be 
constructed according to Osgood’s standard, 
but would include only three concepts and 
one of them would be measured by only one 
gradient. Reducing the D value from 1.5 to 
1.0 would make it possible to include more 
concepts but the number of gradients would 
be inadequate on all but one or two of them. 
This apparent incongruity between the t values and the D values was caused by management’s tendency to mark on labor’s end of the scale or within 1.5 scale units of the neutral point. Labor polarized 1.5 scale units or more on 85 of the 100 items, but this was true of management on only 22 items.
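
The D-value criterion described above can be sketched in a few lines. This is a minimal illustration, not the author's computation: it assumes a 7-point gradient with the neutral point at 4, and the function names are hypothetical. The 1.5-unit standard and the requirement that opposing groups polarize in opposite directions are taken from the text.

```python
NEUTRAL = 4.0    # assumed midpoint of a 7-point gradient
CRITERION = 1.5  # Osgood's standard, in scale units


def d_value(mean, neutral=NEUTRAL):
    """Scale distance between the neutral point and a group's mean response."""
    return abs(mean - neutral)


def satisfactory_item(labor_mean, management_mean,
                      criterion=CRITERION, neutral=NEUTRAL):
    """True when both groups polarize by the criterion, on opposite sides."""
    opposite_sides = (labor_mean - neutral) * (management_mean - neutral) < 0
    return (opposite_sides
            and d_value(labor_mean, neutral) >= criterion
            and d_value(management_mean, neutral) >= criterion)


# With the over-all item means reported in Results (2.2 and 4.6), labor
# clears the criterion and management does not.
print(d_value(2.2) >= CRITERION, d_value(4.6) >= CRITERION)
print(satisfactory_item(2.2, 4.6))
```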

An accompaniment of management’s tend- 
ency to polarize less well than labor was a 
tendency for the members of the management 
group to agree less well with each other. The 
mean standard deviation on the individual 
items was 1.525 for the labor group and 1.612 
for the management group, a difference of 
.087 scale unit, significant at the 5% level of
confidence (t = 2.06).


The 40-item Test 


Since almost all of the items produced 
highly significant differences between labor 
and management mean responses, the test 
was shortened by selecting for each concept 
the five gradients which produced the great- 
est scale differences. The means of the mean 
responses on these items were computed for 
each concept. Fig. 2 compares the profiles 
of the two groups on this 40-item test and on 
the 100-item pretest. 
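
The shortening rule just described (keep, for each concept, the five gradients with the greatest labor-management scale difference) can be sketched as follows. The data layout, function name, and gradient labels are assumptions made for illustration; the item means shown are invented.

```python
from collections import defaultdict


def select_gradients(item_means, per_concept=5):
    """item_means maps (concept, gradient) -> (labor_mean, management_mean).

    Returns, for each concept, the gradients with the largest absolute
    difference between the two group means.
    """
    by_concept = defaultdict(list)
    for (concept, gradient), (labor, management) in item_means.items():
        by_concept[concept].append((abs(labor - management), gradient))
    selected = {}
    for concept, diffs in by_concept.items():
        # Largest difference first; ties broken by gradient name.
        diffs.sort(key=lambda pair: (-pair[0], pair[1]))
        selected[concept] = [gradient for _, gradient in diffs[:per_concept]]
    return selected


# Hypothetical means for one concept and three gradients.
toy = {
    ("seniority", "g1"): (3.5, 4.0),
    ("seniority", "g2"): (2.0, 4.0),
    ("seniority", "g3"): (3.0, 4.0),
}
print(select_gradients(toy, per_concept=2))
```

Applied to eight concepts with five gradients each, the same rule yields a 40-item test.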

Split-half reliabilities and standard errors 
of measurement were computed for this 40- 
item test as on the pretest. They are listed 
in Table 4. Thus, although the reliability 
coefficients for this test were not low, they 
were considerably lower than those secured 
on the pretest. The standard error of meas- 
urement was satisfactory for the labor groups 
but was much higher for the management 


[Fig. 2 plots the mean scale positions on eight concepts: The Closed Shop, Grievance, Arbitration, The Labor Movement, Working During a Strike, Labor in Politics, Seniority, and Individual Bargaining. Labor curves: A = 40-item test, B = 100-item pretest. Management curves: C = 40-item test, D = 100-item pretest.]

Fig. 2. Profiles of labor and management on the 100-item pretest compared with their profiles on the 40-item test.



Table 4

Reliability Coefficients and Standard Errors of Measurement on the 40-Item Test

                           Reliability    Standard Error
Group                      Coefficient    of Measurement

Local-union officers           .87             .18
State labor council            .89             .13
Total labor                    .89             .13
Service club               [illegible]     [illegible]
Industrial association         .82             .42
Total management               .80             .40


groups than on the pretest. It became apparent in later computations that most of this effect was caused by responses on the three concepts grievance, arbitration, and seniority, on which management was either neutral or marked in the direction of labor.


The 25-item Test 


Since, although results were not significantly 
different when the length of the test was re- 
duced to 40 items, the reliability was lowered 
and the error increased, the test was changed 
by dropping the three concepts on which man- 
agement did not polarize, leaving a test of 25 
items. This test consisted of the five remain- 
ing concepts, each paired with the five gradi- 
ents which showed the greatest scale differ- 
ences between labor and management. 


Table 5

Reliability Coefficients and Standard Errors of Measurement on the 25-Item Test

                           Reliability    Standard Error
Group                      Coefficient    of Measurement

Labor-union officers           .89             .16
State labor council            .96             .10
Total labor                    .95             .11
Service club                   .91             .10
Industrial association         .92             .16
Total management               .92             .12


Table 6

Mean Standard Deviations of the Item Distributions

                                    Mean Standard Deviation
Item Group                          Labor    Management    Difference     t     Significance

100-item pretest                    1.525      1.612          .087       2.06       5%
40-item test                        1.503      1.566          .063       1.03       None
25-item test                        1.604      1.542         -.062        .72       None
15 items (seniority, grievance,     1.335      1.603          .270       4.62       .1%
  arbitration)


Reliability coefficients and standard errors of measurement were computed for this test as for the others. They may be seen in Table 5. Comparison of these statistics with those in Table 4 shows that when the concepts grievance, arbitration, and seniority were dropped from the test, the great difference between management and labor in the standard error of measurement disappeared. The error became .11 scale interval for labor and .12 for management. In terms of confidence limits this error was about one-third of a scale interval at the 1% level of confidence. The improvement in reliability was noticeable also, although the test was reduced in length.
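
The "about one-third of a scale interval" figure can be checked with a short calculation. The sketch assumes two-sided normal-theory confidence limits, which the text does not spell out; the function name is hypothetical.

```python
from statistics import NormalDist


def confidence_half_width(sem, level=0.01):
    """Half-width of a two-sided confidence band around a score.

    sem   -- standard error of measurement, in scale units
    level -- significance level (0.01 for the 1% level)
    """
    z = NormalDist().inv_cdf(1 - level / 2)  # about 2.58 for the 1% level
    return z * sem


# Standard errors reported for total labor and total management.
print(round(confidence_half_width(0.11), 2))  # about .28 of a scale interval
print(round(confidence_half_width(0.12), 2))  # about .31 of a scale interval
```

Both values lie close to one-third of a scale interval, in line with the statement above.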

The advantage of the 25-item test over the 40-item test seemed thus to lie in its higher statistical reliability and lower standard error. On the other hand, these advantages could have been the result of some factor other than the test itself, e.g., a changing concept, which Stagner and Osgood believed to have a greater standard deviation than a fixed stereotype (12). This hypothesis was given some credence when the mean standard deviations were computed for both of these tests. These statistics are given in Table 6, along with the same statistics for the 100-item pretest and for the 15 items paired with the three concepts on which management did not assume its expected position. It is apparent from Table 6 that many of the items on which management spread its responses more widely over the scale, as indicated by significance of the difference between labor and management standard deviations on the pretest, were concentrated in the 15 items which were paired with these three concepts. Management here showed standard deviations which differed from labor’s at the .1% level of confidence.
In the light of Stagner and Osgood’s findings, 
it was considered possible that the stereotypes 
tapped by these symbols were changing in the 
management group at the time the test was 
administered. If so, the apparent unreliabil- 
ity of the measuring instrument in these sub- 
areas may have been the unreliability of the 
concept being measured. 

Consequently, the 40-item test was ar- 
ranged with the expectation that the 25-item 
test could be scored out of it for further com- 
parisons. This test will be validated upon 
several subgroups in industry (e.g., line fore- 
men and clerical workers) and reported at a later time.


Discussion


Bakke’s conclusions after his interviews 
with 60 labor leaders and 60 business execu- 
tives were not completely supported by this 
study; nor were Chamberlain’s and Reynolds’ 
descriptions of management's antipathy to- 
ward all of the devices which labor has in- 
vented to restrict the discretion of manage- 
ment in running the business. From the re- 
ports of these and other writers, and from the 
author’s interviews with management, one 
would conclude that management holds so- 
cial stereotypes as extreme as those held by 
labor.

It seems possible that writers in this field have painted a picture of the theoretical position of the “good” union member who has internalized perfectly the presumed norms of his group, and of the “good” member of the management group. The results of this study suggest that the picture is further from the truth in the case of management than in the case of labor. The member of the management group is not nearly so extreme as he is
generally believed to be nor as he believes 
himself to be. Since the Ss in this study 
were selected because they were believed to 
be criterion groups, this does not seem to be 
an overstatement. 

These results suggest that the semantic 
barrier to communication is greater in the la- 
bor group than in management. Although 
the semantic differential as used here does not 
measure intensity of attitude, the more ex- 
treme positions of labor on the scale and the 
greater agreement among members of the la- 
bor group indicate a more restrictive opera- 
tion of the frame of reference in labor than 
in management. As a semantic problem in 
the kind of communication in which management tries to tell its side of the story to labor, this is important.

Perhaps another aspect of the validity of 
the semantic differential should be consid- 
ered here. The Ss in this study were reacting 
to a symbol when they circled a number on 
the gradient-scale. The conclusions of the
study were based on the inference that a con- 
cept was being measured. It is possible that 
the measurement was linguistic, not concep- 
tual, and that other symbols would have 
evoked other responses and other semantic 
distances. Thus, the conclusion drawn above that labor provided a greater share of the semantic barrier than management may apply only to this situation, with these Ss, and with these symbols.


Conclusions 

The following conclusions were drawn:

1. Management’s frame of reference was significantly different from labor’s. Apparently, management has a story to tell [...] the study.

2. The management group revealed meanings for some concepts which were more nearly like labor’s than its own members seemed to believe.

3. There were semantic barriers between the labor and management groups used in this study. The concepts evoked by these symbols in the labor listener are apparently not always the ones intended by the management communicator.

4. The labor group stereotyped more than 
the management group and the stereotypes 
were more extreme. Members of the man- 
agement group agreed with each other less 
well and held less extreme frames of refer- 
ence than members of the labor group. Thus, 
the semantic distance seemed to have resulted 
more from labor's frame of reference than 
from that of management.

5. Management seemed to be leaving its 
traditional position on some of these con- 
cepts and moving in the direction of labor's 
position. 

6. The frames of reference of these two 
groups can be measured with the semantic 
differential and the strength of the semantic 
barrier quantified. 


Summary 


A semantic barrier to communication be- 
tween labor and management was quantified 
by establishing the frames of reference of la- 
bor and management criterion groups on the 
semantic differential, using concepts selected 
from the area of labor-management relations. Significant semantic distance between the two groups was revealed. Labor stereotyped more than management, and assumed more extreme scale positions. Thus, the semantic distance seemed to have resulted more from labor’s position than from management’s. The greater standard deviations of the responses of the management group on three concepts suggested that management’s position on these concepts was changing.


Received April 17, 1957. 


References 


1. Bakke, E. W. Mutual survival, the goal of unions and management. New Haven: Labor and Management Center, Yale Univer., 1947.

2. Chamberlain, N. W. The challenge to management control. New York: Harper, 1948.

3. Edwards, A. F. Experimental design in psychological research. New York: Rinehart, 1953.

4. Fisher, S. C. “The process of generalizing abstraction; and its product, the general concept.” Psychol. Monogr., 1916, 21, No. 2 (Whole No. 90).

5. Heron, A. P. Beyond collective bargaining. Stanford: Stanford Univer., 1948.

6. Hull, C. L. “Quantitative aspects of the evolution of concepts: an experimental study.” Psychol. Monogr., 1920, 28, No. 1 (Whole No. 123).

7. Osgood, C. “The nature and measurement of meaning.” Psychol. Bull., 1952, 49, 197-237.

8. Osgood, C. “Report on Development and Application of the Semantic Differential.” Urbana, Illinois: Institute of Communications Research and Department of Psychology, Univer. Illinois. (Mimeo.)

9. Osgood, C., & Suci, G. “A factor analysis of meaning.” J. exp. Psychol., 1955, 50, 325-338.

10. Peters, R. Communication within industry. New York: Harper, 1950.

11. Reynolds, L. G. Labor economics and labor relations. New York: Prentice-Hall, 1949.

12. Stagner, R., & Osgood, C. E. “An experimental analysis of a nationalistic frame of reference.” J. soc. Psychol., 1941, 14, 389-401.

13. Walker, R. G. “The misinformed employee.” Harvard Bus. Rev., 1948, 26, 267-281.

14. Weaver, C. H. “Measuring the point of view as a barrier to communication.” J. Comm., 1957, 7, 5-9.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


Controlled Association Scores and Engineering Success 


John R. Hills 


Educational Testing Service, Princeton, New Jersey 


The data to be presented here were ac- 
quired during the course of research on en- 
gineering graduate placement tests conducted 
by Educational Testing Service.¹ A controlled association test was included in a
large battery of tests taken by experienced 
engineers. The test asked the Ss to write as 
many synonyms as possible in a limited time 
for each of eight common words. The ex- 
aminees were told that they were taking a 
test of their ability to think of words that are 
related to a key word. Twelve minutes were 
allowed to complete the test. 


The Scores 


Six scores were developed from this test. 
One score was a straightforward count of the 
total number of words or phrases given as 
responses. In terms of the previous factor
analyses of scores of this type (1, 2, 3, 4, 5,
8, 9), it was presumed that this score would 
reflect a composite of Originality and Associa- 
tional Fluency and would be related to super- 
visors’ ratings of success on the job. 

In an attempt to separate the Originality and Associational Fluency aspects of the score, it was decided to score the test for common and uncommon responses (10). By pretesting engineering students,² data were obtained which made it possible to tabulate the frequency with which various words were given as responses to each of the stimuli. Since any definition of “common” would be arbitrary, a number of such definitions were explored. Each of them is based on the same procedure, that of defining “common” words as the smallest set of words which would account for a specified percentage of all responses given to each stimulus word by the pretest groups. Four sets of common words were examined, based on percentages of 20, 35, 50, and 70. These scores are presumed to emphasize Associational Fluency at the expense of Originality.

¹ [...] and Telegraph Company, B. F. Goodrich Company, International Business Machines Corporation, Westinghouse Electric Corporation.

² The cooperation of Princeton University and Westinghouse Education Center in the pretesting is greatly appreciated.
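
The keying procedure described above lends itself to a short sketch: for one stimulus word, take the most frequent pretest responses first until the chosen percentage of all responses is covered, which yields the smallest such set. The function names and the sample words are hypothetical; tie-breaking by alphabetical order is an assumption, since the text does not say how ties were handled.

```python
from collections import Counter


def common_key(pretest_responses, percent):
    """Smallest set of words covering `percent` of all pretest responses."""
    counts = Counter(pretest_responses)
    total = sum(counts.values())
    # Most frequent words first; ties broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    key, covered = set(), 0
    for word, count in ranked:
        if covered * 100 >= percent * total:  # integer test avoids rounding
            break
        key.add(word)
        covered += count
    return key


def common_score(responses, key):
    """Number of an examinee's responses that appear in the key."""
    return sum(1 for word in responses if word in key)


def uncommon_score(responses, key):
    """Number of an examinee's responses that do not appear in the key."""
    return sum(1 for word in responses if word not in key)


# Invented pretest responses to one stimulus word.
pretest = ["big"] * 5 + ["large"] * 3 + ["huge", "vast"]
key_50 = common_key(pretest, 50)
print(sorted(key_50), common_score(["big", "vast", "grand"], key_50))
```

A key built at 100% counts every pretest response as common, which corresponds to the straightforward total count when an examinee gives only words seen in pretesting.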

The sixth score was a count of every word or phrase given as a response but not included in the “common” response key based on 50%. This score was considered to represent “uncommon” responses, the evidence provided by Guilford and his collaborators suggesting to the writer that the ability to call to mind uncommon or farfetched, but synonymous, words is an aspect of originality, similar to the uncommonness-of-response score on Guilford’s test of Number Associations (10, pp. 365-368).


The Subjects 


The test battery was taken by 687 employed, experienced engineers in five companies. By a method described elsewhere (6), the Ss were grouped according to the type of engineering work (the function) they were performing. There were six of these functional groups of engineers: I. Research, II. Development, III. Application, IV. Operations, V. Supervision, and VI. Sales.

Supervisors’ ratings of job success were obtained. Some were abstracted from records maintained by the companies; others were obtained specifically for this study. All ratings were converted to rank orders within work groups of engineers in each company, and these were converted to percentile ranks in order to account for different group sizes.
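
The conversion of ratings to ranks and then to percentile ranks can be sketched as below. The text does not give the exact conversion formula, so the mid-rank convention 100 x (rank - 0.5) / n and the averaging of tied ranks are assumptions; the function name is hypothetical.

```python
def percentile_ranks(ratings):
    """Convert one work group's ratings to percentile ranks.

    A higher rating receives a higher percentile. Tied ratings share the
    average of the ranks they occupy.
    """
    n = len(ratings)
    order = sorted(range(n), key=lambda i: ratings[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and ratings[order[j + 1]] == ratings[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [100 * (rank - 0.5) / n for rank in ranks]


# The same top position maps to different percentiles in groups of
# different size, which is the reason for converting ranks at all.
print(percentile_ranks([3, 1, 2, 4]))
print(percentile_ranks([1, 2, 3, 4, 5, 6, 7, 8]))
```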

The 687 Ss were divided into analysis and cross-validation subgroups of 400 and 287 cases, respectively. Group A (the analysis group) was composed of 50 cases rated high on job success and 25 rated low from each of the functional groups except III, Application. For that group, 25 cases covering the entire range of job success ratings were used, because the number of Ss in this group was rather small. The remaining cases were assigned to the cross-validation group (Group B), the cases being distributed as shown in Table 1.


Table 1

Distribution of 287 Cases in Group B

Ratings of
Supervisors    Research   Development   Application   Operations   Supervision   Sales

High              22          51                          46           31          27
Mixed                                       23
Low               15          25                          25           12          10
Total             37          76            23            71           43          37


Analysis of the Scores


In analyzing the data from Group A, it was assumed for the purpose of analysis of variance that company differences and interactions involving companies were negligible. To obtain a preliminary indication of whether the Controlled Association Test scores were associated with either the job-placement or the success criterion (or their interaction), over-all F ratios were computed treating the 11 subgroups of Ss in Group A as a one-way classification. For only one of the scores, the score based on uncommon responses, was the F ratio not significant at the .05 level. No further use was made of this score in this study. It appears that Originality, if that is what this score measured, has little to do with either placement or success for these Ss.

To determine the relationships involving the other scores treating the two criteria separately, two types of analysis of variance were used. First, for each score a one-way classification analysis was computed treating each of the six functional groups as a level. None of these F ratios was significant. The null hypothesis that the scores are unrelated to differences in functional groups was not rejected. Second, Functional Group III was omitted (it being a single, heterogeneous group), and two-way classification analyses were computed, treating functional groups as one classification and the two levels of job success as the other. For all five scores, the interaction F ratios were insignificant. However, the F ratios for job success for all five scores were statistically significant.


Table 2

Validities of Five Scores for Group A

                                        Keys
                      20%     35%     50%     70%     100%

Score reliability     .12     .31     .54     .45     .80
Validity              [illegible]
Corrected validity    .33     .21     .23     .20     .12


The odd-even item reliability coefficients 
for each of these five scores for the Ss in 
Group A, the correlations between test scores 
and ratings of success, and the correlations 
corrected for unreliability of test scores ap- 
pear in Table 2. 

The correlations with success ratings, the 
test score reliabilities, and the correlations 
corrected for unreliability of the test scores, 
for Group B, the cross-validation sample, ap- 
pear in Table 3. 
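
The text does not state which correction was applied. A common choice, assumed here, is the one-sided correction for attenuation, in which the observed validity is divided by the square root of the test's reliability. The function name is hypothetical.

```python
import math


def correct_for_unreliability(validity, reliability):
    """Estimated validity of a perfectly reliable test score.

    validity    -- observed correlation between test score and criterion
    reliability -- reliability coefficient of the test score, in (0, 1]
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must lie in (0, 1]")
    return validity / math.sqrt(reliability)


# Group B, 20% key: validity .09, reliability .34 (values from Table 3).
print(round(correct_for_unreliability(0.09, 0.34), 2))
```

For the Group B values on the 20% key this gives about .15, which agrees with the corrected validity tabled for that key. Because the tabled validities are themselves rounded, other columns may differ from this calculation in the second decimal place.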

Table 3

Validities of Five Scores for Group B

                                        Keys
                      20%     35%     50%          70%     100%

Score reliability     .34     .52     [illegible]  .63     .80
Validity              .09     .08     .06          .06     .03
Corrected validity    .15     .11     .09          .07     .04


On the basis of these data it appears to the writer that a perfectly reliable score based on a system like that of the 20% key might
have a validity of about .20 for this criterion 
with Ss like these. While this is not spec- 
tacularly high as a validity coefficient, it is 
higher than any others found so far in ETS 
research on engineering graduate placement 
tests, including a large number (over 50) of 
scores from varied measures of abilities, tem- 
perament traits, motivation, and interests. 
Further indication of the possible value of a 
score of this type is derived from the finding 
that in Functional Groups I, II, III, and IV, 
the common response score contributed sig- 
nificantly to the multiple correlations between 
job success and test scores (7). This indi- 
cates that the common response score, which 
was assumed to measure Associational Fluency, taps variance which is not better measured by other tests that have been tried so far.

One must not overlook a salient feature of 
the cross-validation set of data. None of the 
uncorrected validity coefficients on this set of 
data were statistically significantly different 
from zero at the 5% level. However, neither 
are any of them significantly different from 
their counterparts of Group A. Since there 
was no reason to predict shrinkage here, 
Groups A and B were compared in several 
ways to see if a cause for the lower validities 
could be found. No difference between Group A and B was found which would explain the shrinkage. The score range, mean, and standard deviation for each key were very similar in both groups, as can be seen in Table 4. The range, mean, and standard deviation of the success ratings also were very similar, as can be seen in Table 5.

Table 4

Range, Mean, and Standard Deviation of Scores for Five Keys on Groups A and B

              Range                M                  σ
Key        A        B          A        B         A        B

20%       1-13     1-12       6.88     6.84      1.99     1.96
35%       2-21     2-18      11.32    11.29      3.01     3.00
50%       3-28     4-27      15.96    15.71      4.28     4.03
70%       6-38     6-40      22.90    22.57      6.22     5.91
100%     12-92     9-89      41.18    40.51     12.31    12.40


Table 5

Range, Mean, and Standard Deviation of Success Ratings for Groups A and B

            Range      M       σ

Group A      1-8      4.31    2.08
Group B      1-8      4.40    2.16


Summary


Six different scores from a controlled asso- 
ciation test have been studied, using a large 
sample of experienced engineers as examinees, 
and using criteria of job success and job 
placement. The test asks Ss to write, in a 
limited time, as many synonyms as they can 
to eight common words. None of the six 
scores from this test appears to be related to 
job placement, but the five of them which 
are based on the number of common responses 
given seem to be related to job success. These 
five scores vary along a continuum of com- 
monness of response words, from a very strin- 
gent definition of commonness to a definition 
which includes all responses given. Although most of the validity coefficients computed in this study are well below correlations of .20, it is estimated that by using a revised test format a validity approaching .20 could be obtained in a similar testing situation. Although not spectacularly high, such a validity coefficient for ratings of success is promising in comparison with over 50 other variables which were studied in ETS research on engineering graduate placement tests. The most promising score on the Controlled Association Test is the one based on the most stringent definition of commonness. A score based on number of uncommon responses was not significantly related to the criteria.


Received February 11, 1957. 


References 


1. Guilford, J. P., Kettner, N. W., & Christensen, P. R. The relation of certain thinking factors to training criteria in the U. S. Coast Guard Academy. Rep. from Psychol. Lab., Univer. So. Calif., 1955, 13.

2. Guilford, J. P., Kettner, N. W., & Christensen, P. R. A factor-analytic study across the domains of reasoning, creativity, and evaluation II. Administration of tests and analysis of results. Rep. from Psychol. Lab., Univer. So. Calif., 1956, 16.

3. Guilford, J. P., Wilson, R. C., & Christensen, P. R. A factor-analytic study of creative thinking II. Administration of tests and analysis of results. Rep. from Psychol. Lab., Univer. So. Calif., 1952, 8.

4. Guilford, J. P., Wilson, R. C., Christensen, P. R., & Lewis, D. J. A factor-analytic study of creative thinking I. Hypotheses and description of tests. Rep. from Psychol. Lab., Univer. So. Calif., 1951, 4.

5. Kettner, N. W. An information summary of studies of thinking abilities. Los Angeles: Univer. So. Calif., 1955. (Mimeo.)


6. Saunders, D. R. Use of an objective method to 
determine engineering job families that will 
apply in several companies. Res. Bull. 54-26. 
Princeton: Educational Testing Service, Sep- 
tember, 1954. (Multilithed.) 

7. Saunders, D. R. Fifth progress report on Engi- 
neering Graduate Placement Test Research. 
Princeton: Educational Testing Service, Sep- 
tember, 1955. (Multilithed.) 

8. Thurstone, L. L. Primary mental abilities. 
Suppl. to Psychometric Monogr., 1934, No. 1.

9. Thurstone, L. L. Primary mental abilities. Psy- 
chometric Monogr., 1938, No. 1. 

10. Wilson, R. C., Guilford, J. P., & Christensen, 
P. R. The measurement of individual differ- 
ences in originality. Psychol. Bull., 1953, 50, 
362-370. 


Journal of Applied Psychology 
Vol. 42, No. 1, 1958


A Short Questionnaire for the Measurement of Two 
Dimensions of Personality 


H. J. Eysenck 


Institute of Psychiatry (Maudsley Hospital), University of London 


In a previous paper, the writer has de- 
scribed the construction of two 24-item ques- 
tionnaires for the measurement of neuroticism 
and extraversion (1). The studies described 
in this paper were based on item analyses of 
some 250 questions appearing in well-known 
inventories, as well as a factor analysis of 
the finally chosen 48 questions, carried out 
separately for 200 men and 200 women. The 
reliabilities of the new questionnaires were 
reasonably high, in spite of their relative 
shortness, being .88 for neuroticism and .83 
for extraversion. The independence of the 
two scales was demonstrated by the low cor- 
relation of — .09 for the original sample of 
400 men and women, and the even lower cor- 
relation of — .07 for a further male group of 
200. Factorially, too, the items chosen for 
the two scales fell into two clearly separated 
groups, making rotation to simple structure 
easy. A limited number of validation studies 
have been carried out, and are quoted in The 
Dynamics of Anxiety and Hysteria (2).

For many practical purposes, such as work 
in market research, for instance, even a rela- 
tively short questionnaire containing 48 ques- 
tions may be too long, and the present study 
was designed to investigate the Possibility of 
using an even shorter version containing only 
6 questions for each of the two scales, 


Subjects and Method 


The subjects of the investigation were approached 
on a quota sample basis by the interviewers of one 
of the largest and most experienced British Market 
Research organizations; these interviews are carried 
out all over England, correct Proportions of urban 
and rural dwellers, and of the different regions of 
the country being ensured. In addition to sex, the 
sample was divided according to age, 35 being the 
dividing line. Social class was assessed in the usual 
manner, the dividing line being taken between classes 
A, B, and C on the one side, and D and E on the 
other.

The total sample consisted of 1,600 subjects, divided into 8 groups of 200 each on the basis of the three selection criteria taken in all possible combinations. The reliability of sex and age classifications is known to be reasonably high; that for class
is rather lower (3). We may expect these unreli- 
abilities to lead to a varying degree of attenuation 
in our results. 

In the interview, a number of questions were first asked relating to a variety of commercial products; these constituted the ostensible purpose of the interview. A few personal questions about age and occupation followed, and finally the interview was terminated with the 12-item personality questionnaire given below. The questions were asked by the interviewer, and the answers written down by him. The proportion of subjects approached who refused outright was 7%; the proportion of subjects who consented to answer the questions in the first part of the interview and refused to answer the questions in the personality inventory was only 2%.

The questions used in the study are given in Table 1. Each question answered “Yes” was scored plus one point for Neuroticism (marked “N” in the key) or Extraversion (marked “E” in the key); each question answered “No” was scored minus one point for Neuroticism or Extraversion, respectively, as shown in the key. No points were given for answers which could not be clearly classified as either “Yes” or “No” by the interviewer. The possible range of scores on either factor is therefore from plus six points to minus six points, a total of twelve points.
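
The scoring rule just described can be sketched as follows. The item letters follow Table 1; the key letters for items B and E are not legible in the source and are assumed here (B to Extraversion, E to Neuroticism) from the item content and from the statement that each scale has six questions. The function and variable names are hypothetical.

```python
# Scale membership of the twelve questions (N = Neuroticism, E = Extraversion).
# The entries for items "B" and "E" are assumptions, as noted above.
KEY = {
    "A": "N", "B": "E", "C": "N", "D": "E", "E": "N", "F": "N",
    "G": "E", "H": "E", "I": "N", "J": "E", "K": "N", "L": "E",
}


def score(answers):
    """Score one interview.

    answers maps item letters to "Yes" or "No"; any other value (or a
    missing item) is treated as unclassifiable and earns no points.
    """
    totals = {"N": 0, "E": 0}
    for item, scale in KEY.items():
        answer = answers.get(item)
        if answer == "Yes":
            totals[scale] += 1
        elif answer == "No":
            totals[scale] -= 1
    return totals


# A respondent answering "Yes" throughout reaches the top of both scales.
print(score({item: "Yes" for item in KEY}))
```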


Results 


Tetrachoric correlations were run between the twelve questions, and the resulting table of correlations factor analyzed. Thurstone’s procedure was followed, and the two highly significant factors emerging were rotated in accordance with the principle of simple structure (4). Table 2 gives the factor loadings of the rotated factors. Also given in Table 2 are the loadings of the 12 items which they had originally had in the analyses carried out on the whole population of 200 men and 200 women for all 48 items (1). The comparison shows that the figures are remarkably similar from one occasion to the other, although methods of selection have changed considerably, and although in the original analyses the 12 items were only a small part of the total number of items factor analyzed. In some ways the new set of factor loadings is even more clear-cut than the original one. None of the E items has loadings on N as large as .10, and none of the N items has loadings on E as large as .10; in the original study several loadings exceeded this figure. We may conclude, then, that the factor structure has stood up well to repetition.

The correlation between Extraversion and Neuroticism is -.05; this is very similar to the correlations reported previously for our samples of men and women. Again, therefore, the figures from the present study bear out in an important direction the conclusions from the original work. The split-half reliabilities (corrected) are .79 for N and .71 for E; these values are acceptable for group comparisons. (Test-retest reliabilities on small groups have been found to be slightly, but not significantly, in excess of these figures.)


Table 1

Questions and Key

A. Do you sometimes feel happy, sometimes depressed, without any apparent reason?  N
B. Do you prefer action to planning for action?  E
C. Do you have frequent ups and downs in mood, either with or without apparent cause?  N
D. Are you happiest when you get involved in some project that calls for rapid action?  E
E. Are you inclined to be moody?  N
F. Does your mind often wander while you are trying to concentrate?  N
G. Do you usually take the initiative in making new friends?  E
H. Are you inclined to be quick and sure in your actions?  E
I. Are you frequently “lost in thought” even when supposed to be taking part in a conversation?  N
J. Would you rate yourself as a lively individual?  E
K. Are you sometimes bubbling over with energy and sometimes very sluggish?  N
L. Would you be very unhappy if you were prevented from making numerous social contacts?  E


Table 2

[Factor loadings on N and E for items A to L, given separately for the present sample and the original sample. The numerical entries are not recoverable from the source.]


Table 3

Analysis of Variance of the “Neuroticism” Scores

Source of Variance                    Sum of Squares      df       M.S.

Total                                   20269.7775       1599
Main effects
  Sex                                     995.4025          1     995.4025*
  Class                                   311.5225          1     311.5225*
  Age                                     142.8025          1     142.8025*
First order interactions
  Sex:Class                             [illegible]         1
  Sex:Age                                    .0225          1
  Class:Age                                 1.3225          1
Second order interaction                   14.8225          1
Totals
  All interactions                      [illegible]         4    [illegible]
  All differences between groups         1502.4975          7     214.6425
Residual variance within groups         18767.2800       1592      11.7885

* Signifies statistical significance at 5% level.


Table 4

Analysis of Variance of the “Extraversion” Scores

Source of Variance                    Sum of Squares      df       M.S.

Total                                   14263.8975       1599
Main effects
  Sex                                      79.2100          1      79.2100*
  Class                                    16.8100          1      16.8100
  Age                                      28.6225          1      28.6225
First order interactions
  Sex:Class                                 3.4225          1
  Sex:Age                                   4.0000          1
  Class:Age                                 1.2100          1
Second order interaction                     .9025          1
Totals
  All interactions                          9.5350          4       2.3838
  All differences between groups          134.1775          7      19.1682
Residual variance within groups         14129.7200       1592       8.8755

* Signifies statistical significance at 5% level.


Results of an analysis of variance for Neuroticism and Extraversion scores respectively are reported in Tables 3 and 4. Significant differences due to some of the main effects appear in the scores for both factors, but they are more conspicuous on the N scores, where they account for 7.41% of the total variance, than on the E scores, where they only account for 0.94%. The sex difference is the greatest in relation to N and the only significant one in relation to E. On N, the women have a score roughly 1/2 SD higher than the men (i.e., women are less stable); on E, the men have
a score roughly } SD higher than the women 
(ie, men are more extraverted). Class and 
age differences are also significant for N, the 
lower class and younger age groups being 
slightly more unstable emotionally by 4 SD 
and } SD, respectively. None of the interac- 
tions give rise to mean square variances sig- 
nificantly greater than the residual error; on 
the whole they tend to be small. In fact, 
most of the observed differences are slight 
and only significant because of the large num- 
ber of cases; little Psychological importance 
would appear to attach to any of them except 
the sex difference on N, which is large and in 
line with previous work (1), 
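The arithmetic behind Tables 3 and 4 can be rechecked from the printed sums of squares. The sketch below is only an illustration: the dictionary layout and function name are invented here, and the conversion of a sum of squares into a mean difference assumes a balanced design with 800 subjects at each level of a factor, which the description of the sample as "equally divided" suggests but the tables do not state.

```python
import math

# Sums of squares as printed in Tables 3 and 4.
TABLES = {
    "N": {"total": 20269.7775, "between": 1502.4975, "within": 18767.2800,
          "main": {"Sex": 995.4025, "Class": 311.5225, "Age": 142.8025}},
    "E": {"total": 14263.8975, "between": 134.1775, "within": 14129.7200,
          "main": {"Sex": 79.2100, "Class": 16.8100, "Age": 28.6225}},
}
N_SUBJECTS = 1600   # assumption: 800 subjects at each level of every factor
DF_WITHIN = 1592


def summarize(table):
    """Recompute summary quantities from one table's printed sums of squares."""
    ms_within = table["within"] / DF_WITHIN
    sd_within = math.sqrt(ms_within)
    main_total = sum(table["main"].values())
    return {
        # Between-groups and within-groups sums of squares should add to the total.
        "additive": math.isclose(table["between"] + table["within"],
                                 table["total"], abs_tol=1e-6),
        # Share of total variation due to all differences between groups.
        "pct_between": 100 * table["between"] / table["total"],
        # Interaction sum of squares implied by subtraction.
        "ss_interactions": table["between"] - main_total,
        # F ratio of each main effect against the residual mean square.
        "f_ratio": {k: ss / ms_within for k, ss in table["main"].items()},
        # With two equal halves, SS = (N / 4) * d**2, so d = sqrt(4 * SS / N).
        "diff_in_sd": {k: math.sqrt(4 * ss / N_SUBJECTS) / sd_within
                       for k, ss in table["main"].items()},
    }


for name, table in TABLES.items():
    s = summarize(table)
    print(name, round(s["pct_between"], 2), round(s["ss_interactions"], 4))
    for factor in table["main"]:
        print("  ", factor, round(s["f_ratio"][factor], 2),
              round(s["diff_in_sd"][factor], 2))
```

Under these assumptions the between-group share comes out near 7.41% for N and 0.94% for E, and the sex difference on N is a little under half a within-group standard deviation.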

The mean scores for N and E, respectively, ... sample; corrections ... in the total population ... appreciably different ... Distributions ... sufficiently normal to permit the use of correlational statistics,¹ and the variances of the different groups are sufficiently homogeneous to permit analysis of variance to be carried out without transformation. The variances for N are slightly higher than those for E, being 11.73 as compared with 8.83.

A question regarding drinking habits was 
included in the questionnaire. A division was 
made between “drinkers,” i.e., those who 
drank frequently or sometimes, and ‘“non- 
drinkers,” i.e., those who drank very rarely 
or never. The N scores of these two groups 
were very similar, being — .37 as compared 
with .04; if anything, it appears that “non- 
drinkers” as here defined are very slightly 
more unstable than drinkers. The small size 
of the difference does not warrant our taking 
this conclusion too seriously. The E scores of 
the two groups are very significantly different,
the scores being 2.48 and 1.55. Thus
drinkers are about 4 SD more extraverted 
than nondrinkers.


Summary 


An investigation has been carried out to demonstrate the possibility of constructing short reliable personality questionnaires which might be of use in industrial and applied work, and which could be administered in the usual interview situation.

An analytic sample of 1,600 adult subjects, equally divided as to age, sex and social class, was selected on a quota-sampling basis and administered a 12-item questionnaire. Six questions bearing on neuroticism and 6 questions bearing on extraversion had been selected from a previous item-analytic and factor-analytic study in order to cross-validate certain conclusions. Correlations were calculated between the 12 items, and a factor analysis performed: this disclosed two orthogonal factors clearly identical with those of the previous analysis. Analysis of variance gave evidence of certain score differences due to sex, age, and social class, although with the exception of the sex differences these were of minor importance. The 12-item questionnaire was found to have reasonable reliability, and the two personality variables measured by it were found to be uncorrelated. The practical usefulness of instruments of this kind was discussed.

¹ The distribution of the E scores has a noticeable negative skew, but it is doubtful if this is sufficient to make desirable the use of logarithmic or other types of transformation.


Received March 1, 1957. 


References 


1. Eysenck, H. J. The questionnaire measurement of neuroticism and extraversion. Rivista di Psychologia, 1956, 50, 113-140.

2. Eysenck, H. J. The dynamics of anxiety and hysteria. London: Routledge & Kegan Paul, 1957.

3. Eysenck, H. J. The psychology of politics. London: Routledge & Kegan Paul, 1954.

4. Thurstone, L. L. Multiple factor analysis. Chicago: Univer. of Chicago Press, 1947.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


Prehension Force as a Measure of Psychomotor Skill for 
Bare and Gloved Hands ¹ ²


John Lyman and Hilde Groth 


University of California, Los Angeles 


Data regarding thumb-fingertip grasp forces 
and the variables affecting these forces dur- 
ing manipulation are of both theoretical and 
practical importance to the psychology of 
motor skills. For theory, they may aid in 
defining the role of sensory feedback loops in 
manual activities. For application, such data 
are of potential value to the design of termi- 
nal devices for artificial arms, to the design of 
protective hand coverings and to the design 
of perceptual-motor tasks by industrial en- 
gineers. 

Static measurements of maximum grasp force with the thumb opposing the index and middle fingers have been reported by Inman and Eberhardt for eight Ss as a function of arm-hand angle and the distance between the thumb and the opposing fingers (4). Their results, which were not tested for statistical reliability, suggest that distance between the thumb and fingers over the range from one-half to three inches was not an important variable, but that arm-hand angle was, with a mean maximum grasp force of approximately 17.5 pounds at an angle of 145°. No dynamic measurements of normal finger-thumb grasp forces are known to us.

In view of the sparse treatment of this aspect of manipulative skill in the literature, it was felt that a challenging problem in methodology existed for making the measurements, and that the dynamic measurement of prehension forces might prove to be a valuable index to perceptual aspects of motor skill which measurements of movement time and patterns could not take into account. Accordingly, on a psychomotor task requiring discrete movements, observations were made on four variables that might be presumed to affect prehension force. These variables were weight of the object, distance moved, direction moved, and the effect of protective handcovering.

¹ This investigation was supported by QM Contract No. DA 44-109-9M-1531 between the U. S. Army QM Corps and the University of California, Los Angeles. The opinions expressed are those of the writers and do not necessarily reflect those of the contracting agency.

² Some of the experimental results were presented by one of the authors at the APA Convention in San Francisco, September, 1955.


Method 


Apparatus. The work space, which is illustrated in Fig. 1, consisted of a semi-circular piece of one-half inch plyboard, four feet in diameter, with one and five-eighths inch holes in it. These holes were equally spaced on radii from a central hole at 0°, 30°, 60°, 90°, 120°, 150°, and 180°. Each direction was indicated by decal letters from A to G mounted on the work board, starting at the S’s left. The work table was 30 inches high.

The cylindrical object which the S manipulated was made of lucite rendered opaque by means of black paint. It was hollow, and different weights could be inserted into it. Prehension force was measured by means of a special variable capacitor force transducer which is known as the Franklin Institute Laboratories Pressure Indicating Patch, or “filpip” (1). The device was calibrated by means of weights hung from it in a calibration jig designed for the purpose. Forces during experimental runs were recorded continuously on a direct writing oscillograph. Since the measurement system remained relatively stable, checks on the calibration were made only before each S started his series and after he had completed it.

Fig. 1. General view of work space and apparatus.

Experimental design. The effects on prehension force of the following independent variables were investigated in a treatment by Ss factorial design:

1. Weight, consisting of five levels: 18.1 gms., 45.4 gms., 118.0 gms., 308.5 gms., and 426.4 gms.

2. Direction, consisting of seven levels: 0°, 30°, 60°, 90°, 120°, 150°, and 180°.

3. Distance, consisting of three levels: 9.0, 30.8, and 52.6 cm. as measured from the center hole on the board. These corresponded to the first, middle, and outermost holes in the board at each angle. The locations were clearly marked by means of paper indicators at the bottom of each hole as distances one, two, and three respectively.

4. Hand coverings, consisting of the bare hand as a control, latex surgeons’ gloves for a light hand covering, and five finger leather Army gloves with 15% wool, 25% nylon knitted liners for the heavy hand covering condition. No attempt was made to quantify the concept of light and heavy hand covering in terms of specific properties of the gloves.

Subjects. The experimental Ss, solicited from a classroom, were six male sophomore engineering students. All Ss were righthanded and ranged in age from 18 to 28 years.

Procedure. Each S was instructed to grasp the test cylinder with his thumb and first two fingers, pick it up from the center hole, place it in the location designated by the E, release it, then regrasp it and replace it in the center hole. He was told that the device contained a sensitive measuring instrument and that he was to work at whatever he considered a comfortable speed. At no time was it suggested that force was being measured. A post-experiment interview indicated that most of the Ss thought performance time was the criterion measure. ... mentioned force as a possible measure. The bare hand, light hand covering, and heavy hand covering conditions were given to each S in a different order so that all six possible orders were represented by the six Ss. The sequence of weight, distance, and direction combinations was determined by a combined random number and card ... technique for each S. From these individual ... the E called out discrete commands ... of the alpha-numeric code designating direction and distance. The S was allowed approximately ... practice trials before proceeding with the 315 experimental trials. Once the experiment was begun, stops were made only to change the hand coverings or to insert different weights into the hollow cylinder.
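The size of the design described above (5 weights, 7 directions, 3 distances, and 3 hand coverings, giving 315 trials per S) and the counterbalancing of covering orders across six Ss can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the names are invented, and a plain pseudo-random shuffle stands in for the random number and card technique, whose details are not given.

```python
import itertools
import random

WEIGHTS_GM = [18.1, 45.4, 118.0, 308.5, 426.4]
DIRECTION_LETTERS = "ABCDEFG"        # seven directions marked on the board
DISTANCE_CODES = [1, 2, 3]           # 9.0, 30.8, and 52.6 cm
COVERINGS = ["bare", "light", "heavy"]

# Three coverings can be ordered in 3! = 6 ways, one order for each of six Ss.
covering_orders = list(itertools.permutations(COVERINGS))


def trial_sequence(seed):
    """Return the 105 weight/direction/distance combinations in shuffled order."""
    rng = random.Random(seed)  # hypothetical seeding scheme, for repeatability
    combos = list(itertools.product(WEIGHTS_GM, DIRECTION_LETTERS, DISTANCE_CODES))
    rng.shuffle(combos)
    return combos


def session(subject_index):
    """Build one S's full list of trials as (covering, weight, command code)."""
    trials = []
    for block, covering in enumerate(covering_orders[subject_index]):
        for weight, letter, distance in trial_sequence(subject_index * 10 + block):
            # The command is the alpha-numeric code, e.g. "C2".
            trials.append((covering, weight, f"{letter}{distance}"))
    return trials


if __name__ == "__main__":
    first = session(0)
    print(len(covering_orders), len(first), first[:3])
```

Shuffling every combination independently would change the inserted weight on most trials; the remark that stops were made only to change coverings or weights suggests the actual sequence grouped trials by weight, so the shuffle here should be read as a simplification.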


Results ³ and Discussion


Typically, on grasping the cylinder, a sharp 
peak of force occurred, followed by a rapid 
adjustment to a lower level during transport. 
The criterion measure used for the present 
analysis was the maximum force on the re- 
grasp part of the cycle, that is, the peak force 
used by the S when he picked the cylinder up 
to return it to the center hole. It was read 
to the nearest millimeter, and converted to 
grams from a calibration chart. 

A typical section of a record obtained for 
one of the Ss is shown in Fig. 2. 


Fig. 2. Typical record of prehension force measurements. (Letter designates direction; number designates distance.)


An analysis of variance was made of the 
data in which the main effects were tested 
against the simple interactions between the 
respective main effect and Ss. The first order
interactions were tested against the second
order interactions involving Ss and the same 
procedure was employed for testing the sig- 
nificance of the higher order interactions. 
The significance level was set at p< .001 for 
all tests to take care of the fact that the 
standard deviations were rather large rela- 
tive to the means (2). All main effects ex- 
cept direction of movement were significant. 
The triple interaction between handcoverings, 
object weight, and Ss was also significant, in- 
dicating a need for additional investigation of 
these variables. 
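Testing a main effect against its interaction with Ss, as described above, amounts to dividing the mean square for the treatment by the mean square for treatment by Ss. The sketch below shows this for a single factor with one score per S per level; the function name and the toy scores are hypothetical and are not taken from the experiment.

```python
def main_effect_f(scores):
    """F ratio for a main effect tested against its interaction with Ss.

    scores[s][a] holds one observation (or cell mean) for subject s at level a.
    Returns (F, df for the effect, df for the effect-by-Ss interaction).
    """
    n_subj = len(scores)
    n_lev = len(scores[0])
    grand = sum(map(sum, scores)) / (n_subj * n_lev)
    level_means = [sum(row[a] for row in scores) / n_subj for a in range(n_lev)]
    subj_means = [sum(row) / n_lev for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_a = n_subj * sum((m - grand) ** 2 for m in level_means)
    ss_s = n_lev * sum((m - grand) ** 2 for m in subj_means)
    ss_axs = ss_total - ss_a - ss_s      # what is left is the A x Ss interaction

    df_a = n_lev - 1
    df_axs = df_a * (n_subj - 1)
    f_ratio = (ss_a / df_a) / (ss_axs / df_axs)
    return f_ratio, df_a, df_axs


# Hypothetical scores: four Ss (rows) under three levels of a treatment (columns).
toy = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 17],
    [9, 10, 14],
]

if __name__ == "__main__":
    print(main_effect_f(toy))
```

Because consistent differences between Ss are removed before the error term is formed, the test asks whether the treatment effect is large relative to how much it varies from one S to another.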

Figure 3 is a plot of the mean values for the main effects found to be statistically significant. Under the conditions of this experiment, it is apparent that object weight is the most extensive contributor to the stimulus complex determining prehension force. This result was expected in conformity with common experience.

Fig. 3. Mean values for statistically significant variables. [Three panels plot prehension force (gm) against covering (none, light, heavy), distance (mm), and weight (gm).]

³ Tables of the analysis of variance and the mean values and standard deviations of the main effects have been deposited with the American Documentation Institute. Order Document No. 5428 from ADI Publications Project, Photoduplication Service, Library of Congress, Washington 25, D. C., remitting in advance $1.25 for microfilm, or $1.25 for photocopies. Make checks payable to Chief, Photoduplication Service, Library of Congress.

Distance has the least effect, ranging from 
576 to 794 grams over the conditions of the 
experiment. The effect of distance is tenta- 
tively attributed to overcompensation for the 
increased muscle tension required as the arm 
is extended. 

It is of interest to note that even light surgeon’s gloves appear to affect prehension force. This leads us to propose that the modus operandi for the effect of handcoverings on finger manipulation is to distort tactile cues. It may be speculated that this distortion takes the form of lowered tactile sensitivity and false cues from nonlinear transmission of information from the surface of the handcovering. As indicated by another study in this laboratory, the precise nature of the effect appears to depend on such properties as the friction and compressibility of the handcovering materials over the fingers (3). It seems reasonable to suppose that a minimum amount of compression of the glove materials is necessary to transmit the knowledge to the wearer that the object is being held securely. For very light objects, such as the empty cylinder, the normally lower weight discrimination capacity expected for each S suggests that weight, as such, is probably not an important cue to the amount of prehension force needed to secure grasp. Force may be applied at an arbitrary level to prevent slipping between the object-glove-hand interfaces. The variable of weight becomes important when the friction between the glove and hand and/or glove and object is exceeded so that slipping begins. As the weight of the object is increased, more prehension force is required to assure secure grasp. To a point the glove materials will compress as a function of the weight, after which the compression will be maximum.

This exploratory experiment seems to sug- 
gest strongly that variations in prehension 
force as affected by physical variables inher- 
ent in the task and possibly also by changes 
in the amount of tactual sensory information 
received by the operator during task perform- 
ance may be of critical importance in manual 
skill. We feel, therefore, that some measure 
of prehension force has potential value as an- 
other index of motor skill. 


Summary 


An exploratory investigation of the effects of certain physical variables upon changes in thumb-fingertip grasp forces during manipulation has been conducted for a light psychomotor task. The rationale for this study was based upon the opinion that a dynamic measurement of prehension forces might provide information about perceptual aspects of motor skills not accounted for by other performance measures.

The task consisted of simple grasp, transport, and release of a cylindrical object into designated holes of a formboard. This object was instrumented with a pressure transducer permitting continuous recording of grasp force variations. The task was administered in a factorial treatment by subject design to six engineering students and the following variables were investigated at several levels: (a) handcoverings; (b) object weight; (c) distance moved; (d) direction of movement. Analysis of variance indicated that handcoverings, weight, and distance exert a significant effect upon prehension force during a given task.

It was concluded that this measurement seems to have potential value as an index of motor skills in that it is probably sensitive to changes in the amount of tactile sensory information available as well as to physical variables of the task.


Received March 22, 1957. 


References 


1. Frank, W. E., & Gibson, R. J. A new pressure- 
sensing instrument. J. Franklin Institute, 
1954, 258, 21-30. 

2. Lindquist, E. F. Design and analysis of experi- 
ments in psychology and education. Boston: 
Houghton Mifflin, 1953. 

3. Sheridan, T. B. An experimental study of physical criteria for evaluating handcovering designs. Master’s thesis, Univer. of Calif., 1954.

4. University of California (Berkeley), Prosthetic 
Devices Research Project, Subcontractor’s Fi- 
nal Report to Committee on Artificial Limbs, 
National Research Council. Fundamental 
studies on human locomotion and other in- 
formation relating to design of artificial limbs, 
1947, Vol. II. 


Journal of Applied Psychology
Vol. 42, No. 1, 1958


The Edwards Personal Preference Schedule (EPPS) and 
Fakability ¹


Bernard Borislow 


University of Pennsylvania 


Edwards (2) has constructed the Edwards 
Personal Preference Schedule to measure the 
magnitude of fifteen “needs” (after Murray, 
4) assumed to be operative in the “normal 
adult personality.” 

The EPPS is a binary forced-choice objec- 
tive-type inventory the construction of which 
is based upon a novel and ingenious matching 
technique in an attempt to reduce the ap- 
pearance of respondent “faking.” Based upon 
earlier work (1), Edwards has equated his 
item alternatives on the basis of a social de- 
sirability continuum still maintaining the dis- 
criminatory power between the alternatives 
available in any one item of the inventory. 
In this way he hoped to eliminate choices 
that were made on the basis of the greater 
social desirability of one of the alternatives. 
It appears that social desirability is a cri- 
terion which a respondent can use when he 
attempts to “fake” his answers to a person- 
ality inventory. It has been assumed that 
when the social desirability of the alterna- 
tives cannot be discriminated the respondent 
will find great difficulty in misrepresenting 
his “personality traits.”

Recently, Rosen (6) has introduced another aspect of desirability into the area of inventory fakability termed personal desirability. This idea is not entirely new; Rogers and Dymond (5) used the concept of the “ideal self” in evaluating the outcomes of psychotherapy. Essentially, personal desirability is the choice of traits on the basis of “how the individual would like to be” rather than “how the individual thinks he is” (self-appraisal). We are confronted with at least two aspects of fakability or desirability: social and personal.

We must introduce another consideration at this point. The fakability of a personality inventory offers no real problem if the falsified responses can be detected. If a respondent has “faked” his answers and this behavior can be identified we have gained some information about the respondent’s “personality” even though we must disregard the results of the inventory.

¹ The author gratefully acknowledges the counsel and encouragement of M. S. Viteles in the preparation of this report.

That Edwards has performed a remarkable 
task with a great deal of ingenuity in con- 
structing the EPPS cannot be disputed. How- 
ever, we are confronted with the question of 
how well Edwards has succeeded in eliminat- 
ing the appearance of “faking” on the EPPS 
and, more important, to the extent that “faking”
does occur, can such behavior be detected.

Certain definitions introduced now will prove helpful. The profile correlation is an index of the relationship between the profiles yielded by two individual administrations of the EPPS to the same subject. It is derived by ranking and correlating the T scores obtained for each of the fifteen scales from one administration of the EPPS with the T scores obtained from a second administration. The consistency score is the only direct and immediate device for determining the “honesty” of the respondent’s behavior. It is based upon fifteen duplicated items “built into” the inventory with a view toward using the responses as a check on the consistency of the respondent in answering the inventory. The profile stability coefficient is an index of the uniformity of the responses (that go to make up each of the fifteen scales) distributed across the answer sheet. Since half of the raw score for each scale is derived from the rows and half from the columns of the answer sheet, it is possible to correlate these half scores by their respective ranks horizontally and vertically to yield the profile stability coefficient. This is another, although more indirect, check on respondent bias. Finally, the group profile is the profile derived from the pooling of individual subject profiles (by a procedure to be discussed) in the same treatment group.
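The profile correlation defined above, obtained by ranking the fifteen T scores from each administration and correlating the ranks, can be sketched as a rank-order correlation. The code below is an illustration only: the function names and the two profiles are hypothetical, and the use of average ranks for ties is an assumption, since the text does not say how ties were handled.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def profile_correlation(t_first, t_second):
    """Correlate the ranks of the scale T scores from two administrations."""
    return pearson(average_ranks(t_first), average_ranks(t_second))


# Hypothetical T scores on the fifteen scales for one subject, Phase 1 and Phase 2.
phase_1 = [52, 47, 61, 38, 55, 49, 43, 66, 58, 41, 50, 35, 63, 45, 57]
phase_2 = [50, 49, 59, 40, 57, 44, 46, 62, 60, 39, 53, 37, 64, 42, 55]

if __name__ == "__main__":
    print(round(profile_correlation(phase_1, phase_2), 2))
```

The same routine could be applied to the row and column half scores of a single answer sheet to obtain something like the profile stability coefficient, again assuming a simple rank correlation is what is meant.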


Hypotheses Tested 


1. Profile correlations derived from intra-subject comparisons of the administration of the EPPS under standard conditions (self-appraisal) and then under an experimental condition (a “mental set” to “fake” either socially or personally desirable traits) will differ significantly from those obtained from a comparison of two standard (self-appraisal) administrations.

2. Consistency scores derived from the administration of the EPPS under a “mental set” to “fake” (a) socially desirable or (b) personally desirable traits will differ significantly from those obtained under standard conditions of self-appraisal.

3. Profile stability coefficients derived from the administration of the EPPS under a “mental set” to “fake” (a) socially desirable or (b) personally desirable traits will differ significantly from those obtained under standard conditions of self-appraisal.

4. The number of intra-subject response changes that occur between the administration of the EPPS under an initial standard (self-appraisal) administration and then under an experimental administration (either social or personal desirability “set”) will differ significantly from the changes that occur between two standard (self-appraisal) administrations.

5. The group profile derived from the administration of the EPPS under a “mental set” to “fake” socially desirable traits will differ significantly from that obtained under a “mental set” to “fake” personally desirable traits.


Method 


Subjects. The experimental sample was selected at random from a larger group of volunteers in an elementary psychology course during the summer session of 1956, at the University of Pennsylvania. The sample was divided into three groups by a ... procedure after the completion of the first phase of the study (initial self-appraisal testing) with one restriction: equal or nearly equal numbers of males and females were assigned to each group. The three groups were labeled Control, Social Desirability (SD) and Personal Desirability (PD) groups according to the procedure (discussed below) of the second phase of the experiment. The Control group consisted of three males and three females with an age range of 18-24 years and with a median age of 19 years. The SD group consisted of three males and three females with an age range of 18-28 years and with a median age of 21.5 years. The PD group consisted of four males² and three females with an age range of 20-27 years and with a median age of 20 years. The sex and age compositions of the three groups do not differ significantly and by definition are samples drawn from the same population.

Procedure and Instructions. Phase 1 consisted of administering the EPPS to three groups of size N = 6, N = 6, and N = 7, using the standard administration instructions (self-appraisal). Each subject was given a code designation which he wrote on the answer sheet in place of his name along with his sex and age. The same code was used in Phase 2 by each subject so that Phase 1 and Phase 2 profiles could be matched while still keeping the responses anonymous. The entire group of N = 19 was then subdivided into the Control, SD, and PD groups.

Phase 2 consisted of readministering the EPPS to 
all the subjects under directions appropriate to the 
group to which they were assigned. Prior to the 
second administration of the EPPS to the Control 
group, they were instructed that the second stand- 
ard administration was important to the purposes of 
the study so as to maintain an adequate level of 
motivation. Prior to the second administration of 
the EPPS to the SD group, they were informed 
that they should try to respond as they believed a 
“perfect individual characterized by those traits that 
society considers highly desirable” would respond. 
Prior to the second administration of the EPPS to 
the PD group, they were informed that they should 
try to respond according to how they would “like 
to be” rather than how they “actually are.” 

An interval of two weeks elapsed between Phase 1 and Phase 2.


Results 


Table 1 shows the profile correlations obtained from the three groups.³ For each subject, the profile correlation indicates the degree of relationship between his initial self-appraisal profile (Phase 1) and his second profile taken under either self-appraisal, social desirability, or personal desirability instructions depending upon the group to which he was assigned (Phase 2).

A significant difference (P = .002, Mann-Whitney U test) exists between the Control and SD groups and between the Control and PD groups (P < .004, Mann-Whitney U test). There is no significant difference between the two experimental groups, SD and PD (P > .20, Mann-Whitney U test). Therefore, Hypothesis 1 is held tenable. The influence of a mental set to “fake,” either under social or personal desirability instructions, has produced personality profiles significantly different from profiles obtained under self-appraisal conditions.

Table 1

Individual Profile Correlations

Control     SD      PD
  .91      .68     .68
  .85      ..      .68
  .85      .27     .65
  .77      .06     .46
  .70      .05     ..
  .65      .03     .06
                  -.03

² The additional male reported for the experiment “by accident” and it was decided to use him.

³ In essence, those shown for the Control group are test-retest reliability coefficients.


Table 2 shows the consistency scores computed for both administrations of the EPPS for each subject.⁴

The consistency score derived from the first administration of the EPPS was compared with the score derived from the second administration for each subject within the three groups. There were no statistically significant (P’s > .05, Wilcoxon matched-pairs signed-ranks test) or practical differences.

Comparisons between groups under Phase 1 (initial self-appraisal conditions) indicated no statistically significant (P’s > .30, Mann-Whitney U test) difference between the samples of the consistency scores.

Comparisons between groups under Phase 2 indicated that both the Control and PD groups were significantly more consistent than the SD group (Mann-Whitney U test); there was no significant difference between the Control and PD groups. It appears as though self-appraisal and personal desirability mental sets result in significantly more consistent responses than does a social desirability set. This result is not surprising if we recall that Edwards constructed the EPPS so that social desirability as a criterion for “faking” behavior would be optimally eliminated.

Further observation of the consistency scores shows that five of the scores fall below Edwards’ lower limit of acceptability (score below 10 indicates an inconsistent and therefore questionable profile). Four of those scores come from the group of 25 profiles derived from self-appraisals. Only one comes from the group of 13 faked profiles.

Therefore, on the basis of practical significance (in addition to the statistically nonsignificant findings between the Control and PD groups) we must say that the consistency score cannot discriminate faked profiles from self-appraisal profiles. Hypothesis 2 is rejected.

⁴ Even though the groups are listed separately, Phase 1 consisted of a uniform administration of the EPPS to all subjects under standard instructions of self-appraisal.

Table 3 shows the profile stability coefficients obtained for all subjects under both administrations of the EPPS.⁵

All comparisons between groups for both Phase 1 and Phase 2 yield nonsignificant differences for samples of profile stability coefficients (P’s > .05, Mann-Whitney U test).

Therefore, Hypothesis 3 must be rejected. Profile stability does not deteriorate under faked conditions when compared to self-appraisal results. In fact, there is some evidence to indicate that faking under a personal desirability mental set yields a profile significantly more stable than does the self-appraisal condition. This evidence is derived from a within-group comparison for the PD group. Profiles were more stable under the personal desirability condition than under the self-appraisal condition for this group (P = .05, Wilcoxon matched-pairs signed-ranks test).

Table 2

Consistency Scores

        Phase 1                   Phase 2
Control    SD    PD       Control    SD    PD
   15      14    13          14      12    15
   ..      ..    ..          ..      ..    ..
   12      12    12          14      11    13
   12      12    12          12      10    13
   11      10    10          12      10    13
    9      ..    10           6       8    12
                  8                        11

⁵ See Footnote 4.

Table 4 shows the number of changed responses (number of items answered differently in Phase 2 when compared to Phase 1) for each subject in the three groups.

Significantly more responses per subject were changed by Group SD than by the Control group (P < .002, Mann-Whitney U test). Similarly, more responses per subject were changed by Group PD than by the Control group (P < .01, Mann-Whitney U test). There is no difference between the experimental groups, SD and PD (P > .50, Mann-Whitney U test).

Therefore, Hypothesis 4 is held tenable.
Faking produces more response changes per
subject than does a re-self-appraisal. This
result seems to be an obvious one. If faking
produces different profiles, the only way this
can come about is by way of responding
differently. However, if responses changed
greatly for the Control group then we would
be unable to attribute response changes in the
experimental groups to the experimental variables.
If there were no differences in changed-response
scores between the Control and experimental
groups we would then have to attribute
response changes in these latter groups


Table 3 
Profile Stability Coefficients 
Phase 1                Phase 2
Control SD PD          Control SD PD
S oo g s4 9 86 
SO gso st 32 86 3l 
D 63 4 7o 80 -78 
oe 59.63 3 W M 
a ae 59 54 70 
40 4o 38 uo 3 0 
34 59 


Table 4 
Changed Response Scores 


Control    SD    PD
   47      76    87
   38      76    76
   34      75    52
   32      73    51
   32      62    50
   30      47    42
                 38


to error variance (unreliability of the instrument
as well as intra-organismic changes).⁶

In order to determine if a mental set to 
fake socially desirable responses is different 
from a mental set to fake personally desirable 
responses we would need significant within- 
groups homogeneity and between-groups het- 
erogeneity in a comparison of Groups SD and 
PD. 

A low but significant relationship exists be- 
tween the individual profiles of the subjects 
in the SD group derived from the faked situa- 
tion (coefficient of concordance, W = .382, 
P<.01). Similarly, a low but significant 
relationship exists between the individual pro- 
files of the subjects in the PD group derived 
from the faked situation (coefficient of con- 
cordance, W = .255, P < .05). Using the 
sums of ranks for each scale under the SD 
and the PD conditions (derived from the 
above calculations of concordance) we can 
rank the scales under each condition. This 
is probably the best estimate of the true psy- 
chological ranking for each group across all 
scales. There is no significant difference be- 
tween the scale rankings of Groups SD and 
PD (P > .05, Wilcoxon matched-pairs signed- 
ranks test). 
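
As an illustration of the statistic used above, the following is a minimal sketch of the coefficient of concordance W for untied ranks, together with the sums of ranks used to order the scales. The function name and the small rank matrix are hypothetical; the individual subject profiles are not reported in the article, so the example does not reproduce W = .382 or W = .255.

```python
def kendalls_w(rankings):
    """Coefficient of concordance for m judges (rows) who each rank the
    same n objects (columns) from 1 to n with no ties.
    Returns (W, rank_sums)."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(row[j] for row in rankings) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    # S is the sum of squared deviations of the rank sums from their mean.
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return w, rank_sums


if __name__ == "__main__":
    # Hypothetical data: three subjects ranking five scales.
    profiles = [
        [1, 2, 3, 4, 5],
        [2, 1, 3, 5, 4],
        [1, 3, 2, 4, 5],
    ]
    w, rank_sums = kendalls_w(profiles)
    print("W =", round(w, 3))
    print("sums of ranks per scale:", rank_sums)
```

Ordering the scales by their sums of ranks gives the group ranking that the text compares between the SD and PD conditions.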

Therefore, Hypothesis 5 must be rejected. 
We cannot say that faking under a social de- 
sirability mental set is different from faking 
under a personal desirability mental set on 
the EPPS. 

Another interesting result has been obtained
which casts doubt upon the use of the consistency
score and the profile stability coefficient
as measures of response coherency or
fakability. There is no significant relationship
between profile stability coefficients and
consistency scores (Spearman rho coefficient
= .35, P > .05). This result is based upon
the 19 profiles derived from self-appraisal
conditions (Phase 1) where all subjects in
the study were subjected to identical conditions,
taking the EPPS under standard test
instructions and anonymity. There were no
significant differences between men and women
either on samples of profile stability coefficients
or on consistency scores (P's > .10,
Mann-Whitney U test).


6 See Discussion section for further implications of
response changes.

Finally, it should be noted that Edwards
has forcefully eliminated the effects of social
desirability as an influential determiner of
fakability on the EPPS. Although the SD
group was able to coherently falsify its
profile results, only a low relationship (W =
.382) exists between the profile patterns of
the subjects in that group. That the relationship
is low indicates that different item
alternatives were chosen as being socially desirable
for the different subjects. Further,
even though 5 of the 6 consistency scores for
the SD group (under Phase 2) are considered
"acceptable" (see Table 2), these scores are
generally lower than those of either the Control or PD
groups. This indicates that, when a respondent
attempts to choose socially desirable answers
on the EPPS, his responses are apt to
be less consistent than if he were to answer
on a self-appraisal or personal desirability
basis.


Discussion 


ured by the Eq 


this argumentative question: In spite of the
fact that the experimental groups showed a
significantly greater number of changed responses
than the Control group, is it not possible
that the SD and PD group subjects
changed their responses merely because they
interpreted their instructions to mean "change
your answers from the answers you gave last
time"? That is, the subjects might not have
been able to abide by social or personal desirability
instructions and were forced to
choose alternatives on a random basis. This
would also yield a great number of changed
responses.

There is overwhelming evidence that the
SD and PD subjects were not responding on
a purely random basis. The very nature of
the consistency score (number of 15 duplicated
items responded to identically) overrides
the proposed argument. The probability
of the 13 observed consistency scores for
Groups SD and PD (under Phase 2, the experimental
condition) occurring by chance is
less than one in a thousand (χ² = 123.9,
df; see 3, pp. 103-105).

Although Hypothesis 5 has been rejected,
there is evidence to believe that social desirability
and personal desirability are distinct
concepts and were successfully induced and
manipulated in this study. The first bit of
evidence is the fact that the choice of alternatives
using a social desirability criterion
seems to be more difficult than the use of a
personal desirability criterion on the EPPS;
the responses of subjects under the PD condition
were significantly more consistent than
under the SD condition. Secondly, the concordance
of profiles under the two conditions
is somewhat different; more concordance exists
for the SD profiles, as would be expected.
Finally, a negative (non-significant) relationship
is present between group profiles, based
on sums of scale ranks, between the two
groups.


Summary and Conclusions 


This study was designed to determine if
the Edwards Personal Preference Schedule
(EPPS) could be "faked" without detection
in a laboratory situation using college students
as subjects under anonymous conditions.
Nineteen subjects (10 men and 9
women) took the EPPS under standard conditions
(self-appraisal). Two weeks later,
three groups consisting of approximately an
equal number of men and women were randomly
constituted; one was the Control group
(self-appraisal retest) and two were experimental
groups (Social Desirability retest
group and Personal Desirability retest group).
Consistency scores, profile stability coefficients,
and individual profile correlations were
computed and statistical tests were employed
to test certain hypotheses. The following
conclusions seem warranted:

1. The Edwards Personal Preference Sched- 
ule can be faked under structured personal 
and social desirability instructions. 

2. The consistency score and the profile 
stability coefficient are not adequate indices 
of inventory fakability. 

3. There is evidence that differential criteria
exist for fakability in terms of desirability
of response alternatives, social and
personal.

4. The Edwards Personal Preference Schedule
is not greatly susceptible to the influence
of fakability in terms of choice of socially desirable
items, per se.


Received April 1, 1957. 


References 


1. Edwards, A. L. The relationship between the 
judged desirability of a trait and the prob- 
ability that the trait will be endorsed. J. 
appl. Psychol., 1953, 37, 90-93. 

2. Edwards, A. L. Manual for the Edwards Per- 
sonal Preference Schedule. New York: Psy- 
chological Corp., 1954. 

3. Fisher, R. A. Statistical Methods for Research
Workers (5th ed.). London: Oliver & Boyd,
1934.

4. Murray, H. A. Explorations in Personality. New
York: Oxford Univer. Press, 1938.

5. Rogers, C. R., & Dymond, R. F. Psychotherapy
and Personality Change. Chicago: Univer.
Chicago Press, 1955.

6. Rosen, E. Self-appraisal, personal desirability 
and perceived social desirability of person- 
ality traits. J. abnorm. soc. Psychol., 1956, 
52, 151-158. 

7. Siegel, S. Nonparametric Statistics for the Be- 
havioral Sciences. New York: McGraw-Hill, 
1956. 




Journal of Applied Psychology 
Vol. 42, No. 1, 1958 


Learning, Prediction, and Readability¹


Herbert Rubenstein² and Murray Aborn³


Air Force Personnel and Training Research Center 


In the course of investigating the relation- 
ship between the learning of artificial language 
materials of different degrees of organization 
and the learning of English, we obtained re- 
call and word-prediction scores for a number 
of English passages of approximately equal 
length. It seemed to us that these data af- 
forded an excellent opportunity for determin- 
ing the interrelationship of the readability of 
the passages, the ease with which they were 
learned, and the degree to which their con- 
stituent words were predictable. There was 
reason to believe that significantly positive 
intercorrelations would be found. Consider 
learning and prediction. If we follow information
theorists in accepting the close connection
between prediction and amount of information,
then the findings of Miller and
Selfridge (5) or those of the present authors
(1, 7) indicate that a substantial correlation
between amount learned and success in prediction
should exist. As for learning and
readability, the likelihood that they are correlated
is strongly suggested by the fact that
both are related to ease of comprehension.
Reed (6), for exampl


passage taken from th
addition, it may be
nation power of readi
initially tested against a criterion of comprehensibility.
With regard to prediction and
readability, Taylor's (8) study, though involving
only small groups of Ss and a very


1 This report is based on work done under ARDC
Project No. 7730, Task No. 17125, in support of the
research and development i
production, translation, publication, use, and disposal
in whole or in part by or for the United States
Government.

2 Now at The Operational Applications Laboratory,
Air Force Cambridge Research Center, Bolling Air
Force Base, Washington 25, D. C.

3 Now at The Industrial College of the Armed
Forces, Washington, D. C.


small number of language samples, indicates 
that “Cloze” procedure—essentially a meas- 
ure of predictability based upon a knowledge 
of both the preceding and succeeding con- 
text—tends to rank prose passages of differ- 
ent levels of difficulty in about the same way 
as the readability ratings of the two most 
widely used formulas. For example, Taylor 
obtained a correlation of .46 between Cloze 
and Dale-Chall rankings even when his data 
consisted of passages deliberately selected to 
amplify the weaknesses of current methods of 
measuring readability. 


Materials 


Thirty passages, each approximately 200
words in length, were selected from such
widely varying sources as children's stories
(10 passages), popular magazines of the
Saturday Evening Post variety (10 passages),
and higher quality magazines (e.g., The New
Yorker), scientific texts, and philosophical
writings (10 passages). Each passage was
reproduced on a single sheet of paper, single
spaced, each sentence beginning on a new
line (for purposes of reading similarity with
other materials). The passages bore no
identification regarding author, source, or
level of reading difficulty.


Readability 
Two measures of 


infrequent words according tO 
Dale-Chall, These same 100 wo 


to obtain the 
the hundredth 




Table 1


Summary Statistics for the Variables


Variable                  Range of Scores a    Mean     SD
Flesch Readability        97.4-1.0             56.93    23.65
Dale-Chall Readability    4.2-10.7              7.44     1.96
Amount Learned            27.4-10.1            18.50     5.12
Prediction                14.0-6.4              9.03     2.07


a From "easy" to "difficult."


brought the sample size closer to 100. The 
means and standard deviations of the dis- 
tributions of readability scores are shown in 
Table 1. 
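
For orientation, the two readability measures summarized in Table 1 can be sketched from their published formulas (Flesch, 3; Dale and Chall, 2). The function names and the sample counts are hypothetical, and the sketch takes word, sentence, syllable, and unfamiliar-word counts as given; it does not count syllables or consult the Dale list, and it says nothing about how the 100-word samples in this study were drawn.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch (1948) Reading Ease: higher scores mean easier text."""
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word


def dale_chall_raw_score(words, sentences, unfamiliar_words):
    """Dale-Chall (1948) raw score: higher scores mean harder text.
    unfamiliar_words is the number of words not on the Dale list."""
    pct_unfamiliar = 100.0 * unfamiliar_words / words
    avg_sentence_length = words / sentences
    return 0.1579 * pct_unfamiliar + 0.0496 * avg_sentence_length + 3.6365


if __name__ == "__main__":
    # Hypothetical counts for a single 100-word sample.
    print(round(flesch_reading_ease(100, 5, 150), 2))
    print(round(dale_chall_raw_score(100, 5, 10), 2))
```

The two scales run in opposite directions, which is consistent with the ranges in Table 1: the easiest passage has the highest Flesch score and the lowest Dale-Chall score.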


Amount Learned 


Subjects. One hundred and twenty-five college
students (82 males, 43 females; 26 Freshmen,
40 Sophomores, 27 Juniors, 25 Seniors,
2 Graduate, and 5 unclassified students) were
selected for participation in a learning experi- 
ment involving protracted training in the mas- 
tery of artificial language materials and in the 
memorization of language samples (both arti- 
ficial and English) for five different study- 
time intervals. These 125 students were 
screened from a recruitment of about twice 
that number on the basis of age, level of edu- 
cation, and other traits and abilities as meas- 
ured by the following tests: the ACE ee 
Edition), the Digit Span subtest of the Wechs- 
ler-Bellevue Intelligence Scale, the Minnesota 
Multiphasic Personality Inventory, and the 
Clyde Projective Test. The selection pro- 
cedures were aimed at obtaining a group of 
Ss somewhat heterogeneous with regard to 
intelligence and rote memory, but relatively 
homogeneous with regard to motivation to 
complete the experiment and the capacity to 
perform with reasonable consistency under 
stress.

4 The training and testing of Ss was carried out
under Contract AF 41(657)-59 with the Auburn Research
Foundation, Inc., Alabama Polytechnic Institute,
Auburn, Alabama. Contract activities were under
the direction of Willard H. Nelson, Principal
Investigator, and Virginia Zachert, Research Supervisor.


Training. The Ss attended 36 hour-long
training sessions distributed over a period of
about 10 weeks. Seven of these sessions were
devoted to practice in the memorization of 
English passages of the same length and of 
the same range of difficulty as the 30 experi- 
mental passages described above. All Ss re- 
ceived the same amount of practice in five 
study-time intervals: one-half minute, one 
minute, two minutes, three minutes, and four 
minutes. Practice passages were placed in 
front of the Ss face down, and the study 
time to be allowed for memorization was an- 
nounced. At the signal to start, Ss turned the 
passage over and began memorizing. They 
had been coached in scanning passages from 
the first word on—attempting to learn in nor- 
mal reading sequence rather than by skipping 
around. At the signal to stop, Ss turned their 
passages face down and immediately recorded 
all they could recall on prepared answer 
sheets. The Ss then engaged in five minutes 
of interpolated activity, followed by another 
period of memorization (study-time intervals 
and levels of reading difficulty were random- 
ized among and within training sessions), and 
so on. 

Experimental groups. When training pro- 
cedures were completed, the Ss were divided 
into five matched groups on the basis of their 
scores on the following: ACE Quantitative, 
artificial language—low organization, artificial 
language—high organization, and English. 
The experimental groups were now adminis- 
tered the 30 experimental English passages, 
each group taking each passage for only one 
of the five study-time intervals. Both study 
times and levels of reading difficulty were 
randomized among groups and within experi- 
mental sessions. This procedure was carried 
out over a period of one week, with six pas- 
sages administered per daily session. 

Scoring. Two criteria had to be satisfied 
for a word to be considered correctly learned: 
First, the word had to be reproduced unam- 
biguously, as it appeared in the passage. For 
example, houses was not accepted as correct 
if the word in the passage was house. Second, 
the word had to be reproduced in the correct 
serial position. Considerable latitude was al- 
lowed here. A word was considered to be re- 
produced in the correct position if it occurred 
in the proper serial position either with respect
to the beginning of the passage or to
the beginning of the appropriate sentence. In 
the latter instance, however, the word in ques- 
tion had to be part of a sequence of at least 
two words which was correctly located within 
the sentence. 

The mean amount learned per minute of 
study time was taken as the learning score for 
each passage. This was obtained by sum- 
ming the mean amounts learned in the five 
study-time intervals and dividing by 10.5, the 
sum of the study times. The mean and 
standard deviation of the distribution of 
learning scores are shown in Table 1. 
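
The learning score described above reduces to a short calculation, sketched here. The function name and the example means are hypothetical; only the five study times and the divisor of 10.5 minutes come from the text.

```python
# Study-time intervals in minutes, as given in the Training section.
STUDY_TIMES_MIN = (0.5, 1.0, 2.0, 3.0, 4.0)


def learning_score(mean_words_by_interval):
    """Mean amount learned per minute of study time for one passage:
    the sum of the mean amounts learned in the five study-time
    intervals divided by the sum of the study times (10.5 minutes)."""
    if len(mean_words_by_interval) != len(STUDY_TIMES_MIN):
        raise ValueError("expected one mean per study-time interval")
    return sum(mean_words_by_interval) / sum(STUDY_TIMES_MIN)


if __name__ == "__main__":
    # Hypothetical mean numbers of words correctly recalled for one passage.
    example_means = [10.0, 18.0, 35.0, 50.0, 76.0]
    print(learning_score(example_means))
```

Dividing the summed means by the summed study times weights each interval by its length, so the score is an overall rate rather than an average of five separate rates.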


Prediction 


Subjects. Thirty-seven college students 
drawn from the same population as those who 
participated in the learning experiment (but 
excluding those who actually did participate) 
were selected to take part in a prediction ex- 
periment employing the same 30 English pas- 
sages. These students were selected to match 
the Ss used in the learning experiment on the 
basis of age, sex, level of education, and their 
scores on the ACE Total, L, and Q.

Procedure. The Ss were given the first
word of a passage and instructed to guess the
next word. After they wrote their guesses in
black pencil, they were told the correct word,
i.e., the one actually occurring in the passage.
They were then instructed to write this
word in red pencil on the line above the word
they had guessed, even if they had guessed
correctly. Now they were told to guess the
next word, given the correct word, told to
record it above their guess, and so on. The
Ss were repeatedly instructed to read through
all of the passage covered thus far (which they
had written in red above their guesses) before
guessing the next word.
the administrato 


f reading diffic lt; 

randomized among sessions, i n 
Since it would be fairly tempting for Ss to
alter their guesses during the testing procedure,
measures were adopted to prevent
cheating. The Ss met in small groups of
from four to ten persons, overseen by an administrator
and a proctor. Furthermore, each
answer sheet had a carbon paper and another
sheet stapled under it, so that any erasure
would appear as a smudge on the bottom
sheet.
Scoring. For a guess to be counted correct,
the S had to write the word in exactly the
same form as it appeared in the passage.
Misspellings were considered correct only
when they were unambiguous.

The mean number of correct predictions
per word, computed from the guesses on words
2-67 in each passage, was taken as the passage
prediction score. The 67th word had to
be taken as the limit since the prediction experiment
was carried out only up to the maximum
amount learned in each passage, and
there was one passage in which no S was
able to recall any word beyond the 67th. The
mean and standard deviation of the distribution
of prediction scores are shown in Table 1.
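
One reading of the passage prediction score is sketched below: for each of words 2-67, count the Ss who guessed that word correctly, then average the counts over the 66 words. This interpretation is an assumption (it fits the range of scores in Table 1 for a group of 37 Ss), and the function name and data layout are hypothetical.

```python
def prediction_score(correct, first_word=2, last_word=67):
    """Mean number of correct predictions per word for one passage.

    correct[s][w - 1] is True when subject s guessed word number w
    correctly. Word 1 is supplied by the administrator, so scoring
    starts at word 2.
    """
    word_numbers = range(first_word, last_word + 1)
    per_word_counts = [
        sum(1 for subject in correct if subject[w - 1])
        for w in word_numbers
    ]
    return sum(per_word_counts) / len(per_word_counts)


if __name__ == "__main__":
    # Hypothetical data: two subjects, 67 words each.
    always_right = [True] * 67
    first_half_only = [False] + [True] * 33 + [False] * 33
    print(prediction_score([always_right, first_half_only]))
```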


Results and Discussion 


The coefficients of correlation shown in
Table 2 are all significant beyond the the
level and may be taken as estimates of the
degree of interrelationship between learning,
predictability, and readability as measured by
the techniques employed in this study. Dale-Chall
scores appear to correlate more highly
than Flesch scores with both learning and
prediction, though only in the case of the
correlations with learning is the difference
between the two formulas statistically significant
(p < .02).⁵ Klare (4) similarly presents
evidence that the Dale-Chall is superior
to the Flesch formula in rating reading-test
passages, though the differences he obtained
are statistically nonsignificant. If an ample
relationship between ease of learning and ease
of comprehension were demonstrable, measures
of learning might provide more reliable
criteria for the effectiveness of readability ratings
than tests of comprehension, often difficult
to scale and control.


Table 2


Intercorrelation of the Variables a


                          Dale-Chall     Amount
                          Readability    Learned    Prediction

Flesch Readability O18 61 ‘4
Dale-Chall Readability oF 3
Amount Learned


"Si
scots
ity s 8 F se to the ati
derived from the other three measures are inverse ‘
rel i
con
ve sign has been removed.

It is interesting to observe that the correlation
between amount learned and prediction
does not differ significantly from the correlation
between amount learned and the Dale-Chall.
One would expect that prediction
would correlate more closely with learning
than either readability formula since prediction
and learning both involve the factor of
contextual constraint. Most likely, the meas- 
ure of predictability employed in this study 
was not as sensitive as it might be. Rela- 
tively few words could be successfully pre- 
dicted on the basis of one trial and from a 
knowledge of the preceding context alone. 
Other measures of predictability, particularly 
those based on a knowledge of the context on 
both sides of the word or based upon some 
index of intersubject agreement would prob- 
ably show a higher correlation between these 
two variables than is here indicated. 

This method of prediction may also be re- 
sponsible for differences in the magnitude of 
the relationship between readability and pre- 
dictability obtained here and obtained by 
Taylor (8). Taylor reports a maximum correlation
of .94 between readability rankings
assigned by Cloze procedure and by the Dale- 
Chall, and a maximum correlation of .71 be- 
tween rankings assigned by Cloze procedure 
and by the Flesch formula. Of course, these 
coefficients are based on data obtained from 
only six passages; nonetheless, it is quite pos- 
sible that these higher correlations result from 
a more sensitive measure of predictability. 

Despite the differences between the two 


5 Since this and all further comparisons involved
sets of bivariates having one array in common,
Hotelling's test for differences between correlated coefficients
was employed. All statements of significance
are based upon p values for the distribution
of t at 27 degrees of freedom.


readability formulas in the degree to which 
they correlate with learning and prediction, 
they exhibit the same high degree of intercor- 
relation reported by other investigators (4, 8). 
They seem to be measuring the same factors 
—probably grammatical complexity (through 
the measure of sentence length) and vocabu- 
lary level. Judging from the apparent su- 
periority of the Dale-Chall when both for- 
mulas are tested against outside criteria, how- 
ever, it seems that the number of unfamiliar 
words in a passage of English gives a better 
estimate of vocabulary level than word length 
in syllables. 
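
The comparisons between correlations discussed in this section rest on Hotelling's test for two correlations that share one variable, referred to t with N - 3 degrees of freedom (27 for the 30 passages). A minimal sketch of that statistic follows. The function and argument names are hypothetical, and the correlation values in the example are invented for illustration; they are not the entries of Table 2.

```python
import math


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for the difference between r_xy and r_xz, two
    correlations computed on the same n cases and sharing variable x.
    r_yz is the correlation between the two non-shared variables.
    Returns (t, degrees_of_freedom)."""
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2 + 2 * r_xy * r_xz * r_yz
    if det <= 0:
        raise ValueError("correlations do not form a valid matrix")
    df = n - 3
    t = (r_xy - r_xz) * math.sqrt(df * (1 + r_yz) / (2 * det))
    return t, df


if __name__ == "__main__":
    # Hypothetical values: x = amount learned, y and z = two readability
    # formulas, n = 30 passages.
    t, df = hotelling_t(r_xy=0.80, r_xz=0.60, r_yz=0.90, n=30)
    print(round(t, 2), df)
```

Because the determinant shrinks as the two formulas correlate more highly with each other, a given difference between their correlations with learning produces a larger t than it would for independent samples.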


Summary and Conclusions 


Subjects previously given intensive practice 
in memorizing English passages of a wide 
range of reading difficulty were assigned the 
task of learning as much as they could of 30 
experimental English passages in set periods 
of study. Another group of Ss, selected to 
match the learning group, went through these 
same 30 passages, predicting each successive 
word (only one guess allowed) from a knowl- 
edge of all the preceding context. Readabil- 
ity scores were calculated for each passage 
according to the Flesch and Dale-Chall for- 
mulas. Product-moment correlations were 
computed between the mean amounts learned, 
the mean number of correct predictions per 
word, the Flesch, and the Dale-Chall read- 
ability scores. Intercorrelations among the 
variables showed that: 

1. Learning, prediction, and readability are
closely interrelated. 

2. Prediction and readability correlate about 
equally well with learning. However, the 
method of prediction employed in this study 
involved a single guess based upon a knowl- 
edge of the preceding context alone, and it 
may be that such a method yields a relatively 
insensitive measure of predictability. 

3. The Dale-Chall formula correlates sig- 
nificantly more closely than the Flesch for- 
mula with learning, and somewhat more 
closely—though not significantly so—with 
prediction. Whether this is to be taken as 
a demonstration of the superiority of the 




Dale-Chall formula, however, depends upon 
the degree of relationship between learning 
and comprehension. It is suggested that if 
ease of learning is shown to be sufficiently 
related to ease of comprehension, amount 
learned might provide a better criterion of 
readability than the more difficult to control 
tests of comprehensibility. 

4. Despite observed differences between the 
two readability formulas, they were found to 
correlate very highly with each other, thus 
supporting the notion advanced by other in- 
vestigators to the effect that the two formulas 
measure substantially the same things. 

5. Both readability formulas showed a 
higher correlation with learning than with 
prediction, but the differences were not sta- 
tistically significant. 


Received April 3, 1957. 




References 


1. Aborn, M., & Rubenstein, H. Information theory
and immediate recall. J. exp. Psychol., 1952,
44, 260-266.

2. Dale, E., & Chall, J. S. A formula for predicting
readability. Educ. Res. Bull., Ohio State University,
1948, 27, 11-20 and 37-54.

3. Flesch, R. F. A new readability yardstick. J.
appl. Psychol., 1948, 32, 221-233.

4. Klare, G. R. Measures of the readability of written
communication: an evaluation. J. educ.
Psychol., 1952, 43, 385-399.

5. Miller, G. A., & Selfridge, J. A. Verbal context
and the recall of meaningful material. Amer.
J. Psychol., 1950, 63, 176-185.

6. Reed, H. B. Repetition and association in learning.
Ped. Sem., 1924, 31, 147-155.

7. Rubenstein, H., & Aborn, M. Immediate recall
as a function of degree of organization and
length of study period. J. exp. Psychol., 1954,
48, 146-152.

8. Taylor, W. L. "Cloze procedure": a new tool
for measuring readability. Journalism Quart.,
1953, 30, 415-433.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


Relationship Between Stated and Measured Interests of
Two Groups of United States Air Force Officers¹


Paul G. Jenson²


Macalester College, St. Paul, Minnesota 


In a study conducted by the staff of the In- 
dustrial Relations Center at the University 
of Minnesota for the United States Air Force, 
Strong Vocational Interest Blanks (SVIB) 
and personal history questionnaires were com- 
pleted by Air Force Officers in the person- 
nel and accountant-comptroller areas. On the 
personal history questionnaires the officers in- 
dicated their choice of civilian occupation. 
The relationship between this civilian choice 
of occupation (stated interests) and meas- 
ured vocational interests is the subject mat- 
ter of this report. 


Methods and Procedure 


Completed materials were obtained from Air Force 
Officers who had Air Force specialty code numbers 
in the personnel or accountant-comptroller areas. In- 
cluded in this study are returns from 1155 personnel 
officers and 243 accountant-comptrollers. These N's 
represent about an 84% return of the material which 
Was originally sent. 

Expert judges were used in two different phases.
Three judges³ independently interpreted the
profiles using the Darley technique (1). Judgments
of primary interest patterns were the basis for determining
measured vocational interests. For a primary
interest pattern to exist a majority or plurality
of scores in an occupational group had to be A or
B+ scores.

As patterns are judged in terms of scores for occupational
groups, it was decided to judge occupational
groups which contain a single scale
along with the group of occupations with which it

1 The original study of which this report is a part
was supported by the United States Air Force under
contract no. AF 18(600) 337 and was monitored
by the Officer Personnel Division, Human Resources
Research Institute, Air Research and Development
Command, Maxwell Air Force Base, Alabama. The
opinions and conclusions expressed herein are not to
be construed as necessarily carrying the official sanction
of the Department of the Air Force or of the
Air Research and Development Command.

2 This paper is a part of the author's Ph.D. thesis.
The author is indebted to his major advisor, Donald
G. Paterson, and to Marvin D. Dunnette, George
W. England, Donald P. Hoyt, Thomas M. Magoon,
and Harry Roadman for their generous assistance.

3 The judges were Donald G. Paterson, Marvin D.
Dunnette, George W. England, and Harry Roadman.


was most closely related. Relatedness was deter- 
mined by Strong’s published intercorrelations (3) 
and by judgmental factors. Thus, Group III which 
consists solely of the Production Manager scale was 
included with Group IV, the technical group; Group 
VII which is the Certified Public Accountant scale
was included with Group VIII, the business detail 
group; and Group XI, the President of Manufac- 
turing Concern scale, was included with Group IX, 
the sales group. Because there is little relationship 
between Group VI, the Musician scale, and other 
occupational groups it was decided to exclude that 
group from the analyses. 

One other modification of the SVIB profile was 
made. Because the major study which used these 
data was primarily concerned with interests of men 
in the personnel and accountant-comptroller areas, 
it was decided to include the Personnel Director and 
Public Administrator scales as a separate group from 
the rest of the occupations in the social service group. 

The names and numbers of the occupational groups 
on the SVIB which were used in this study are as 
follows: I. Biological Sciences; II. Physical Sciences;
III-IV. Technical; Va. Personnel; V. Social Service;
VII-VIII. Business Detail; IX-XI. Business Contact;
X. Verbal-Linguistic.

In order to say that an interest pattern existed, 
at least two of the three judges had to indicate the 
presence of the pattern. The judging was consistent. 
There was 90% agreement between at least two out 
of three judges in the judging of primary patterns, 

Judges were also used in categorizing the civilian 
choice of occupation. The question on the personal 
history questionnaire which asked the officers to in- 
dicate their civilian choice of occupation was an 
open-ended question so there were many different 
types of responses. In order to make these data 
meaningful these occupations were categorized in 
terms of the occupational groups on the SVIB. 

Three qualified judges⁴ determined the occupational
group on the SVIB to which a particular occupation
would most likely belong. "Belongingness"
was in terms of vocational interests.⁵ Little difficulty
was encountered when there was an occupational
scale on the SVIB for the occupation selected.
However, difficulty was encountered with occupa- 
tions which seemingly represented a combination of 
interests and where the selected occupation was so 

4 These judges were Paterson, Donald P. Hoyt,
and Thomas M. Magoon.

5 This type of judgment has fairly high validity in
spite of the known ambiguity in occupational titles
as shown by Strong (4, pp. 13 and 91 ff.).




vague and general as to preclude classification. Examples
of occupations which were difficult to judge
and where there was little or no agreement were 
management consultant, managing a sports team, 
radio work, and aircraft manufacturing. 

Criterion established for including a selected oc- 
cupation in an occupational group was agreement 
between at least two out of the three judges. This 
judging was consistent. There was 78% agreement 
between at least two out of the three judges in as- 
signing the occupations selected by the officers to 
the various occupational groups on the SVIB. 


Results 


In Table 1 the number and percentage of 
officers with judged stated interests in the 
different occupational groups are shown. Also 
included are the number and percentage who 
have no stated interests and the number and 
percentage who have stated interests which 
were excluded from occupational groups due 
to disagreements among the judges as to the 
most appropriate occupational group. 

The judges disagreed more in assigning the 
stated interests of the accountant-comptrollers 
than of the personnel officers to occupational 
groups on the SVIB. About 22% of the ac- 
countant-comptrollers had stated interests in 
this category as compared with about eight 
per cent of the personnel officers. The ac- 
countant-comptrollers more frequently indi- 
cated choices in the business detail area and 


Table 1

Judged Stated Interests of Personnel Officers and
Accountant-Comptrollers

                          Personnel            Accountant-
                          Officers             Comptrollers
Occupational Group        N       Per Cent     N       Per Cent
Biological Sciences       12        1.0          2        .9
Physical Sciences         27        2.3          1        .4
Technical                141       12.2         18       7.4
Personnel                387       33.5          4       1.6
Social Service            57        4.9          0       [illegible]
Business Detail          151       13.1        122      50.2
Business Contact         137       11.9         17       7.0
Verbal-Linguistic         61        5.3          8       3.3
No Stated Interests       92        8.0         16       6.6
Disagreement Between
  the Judges              90        7.8         55      21.8
Total                  1,155      100.0        243     100.0


Table 2

Relationship Between Stated Interests and Measured
Interests for the Personnel Officers
(Chi square = 81.3, P < .001)

                          Stated Interests     Stated Interests
                          and                  but not
                          Measured Interests   Measured Interests
Occupational Group        N       Per Cent     N       Per Cent
Biological Sciences*       2         .4         10       2.2
Physical Sciences          8        1.6         19       4.1
Technical                 62       12.0         79      17.2
Personnel                258       50.1        129      28.2
Social Service            23        4.5         34       7.4
Business Detail           70       13.6         81      17.7
Business Contact          82       15.9         55      12.0
Verbal-Linguistic         10        1.9         51      11.1
Total                    515      100.0        458     100.0

* The biological science group was not included in the
chi square because of the small N.


in being self-employed and these were among
the occupations about which the judges most
frequently disagreed.

A plurality of both the personnel officers
and the accountant-comptrollers had stated
interests in occupations which were similar to
their military occupations. About one-third
of the personnel officers selected some aspect
of personnel work and about one-half of the
accountant-comptrollers selected the business
detail area.

Because the accountant-comptrollers tended
to choose civilian occupations in just the business
detail area, the frequencies in the other
occupational groups are so small as to preclude
the use of extensive statistical analysis.
Thus, Table 2 shows the agreement and
disagreement between stated interests and measured
interests just for the personnel officers.
(Excluded from the table are 182 personnel
officers who had no stated interest or whose
stated interest could not be placed in an
occupational group because of disagreement
among the judges.)

The chi-square value of 81.3 which was obtained
for the data in Table 2 is highly significant.
There was a significantly greater
agreement between stated interests and measured
interests for some occupational groups
than for others. The highest agreement is in
the personnel area. About one-half of the
personnel officers (50.1%) have stated and




Stated and Measured Interests 3 


measured interests in the personnel area. 
There is a big drop in the number of per- 
sonnel officers who have agreement between 
stated and measured interests to the next 
highest groups which are the business con- 
tact, business detail, and technical groups. 
From these groups there is another drop to 
the social service group and then again to the 
verbal-linguistic, physical science, and bio- 
logical science groups. 
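
The chi-square value reported for Table 2 can be checked from the frequencies themselves. The sketch below is an illustration, not the author's computation: it assumes a Pearson chi-square on the seven groups that remain once the biological sciences are omitted, and it takes the Business Detail entry in the second column as 81, the value implied by the column total of 458. The function name is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# (stated and measured, stated but not measured) for each occupational group.
personnel_officers = {
    "Physical Sciences": (8, 19),
    "Technical": (62, 79),
    "Personnel": (258, 129),
    "Social Service": (23, 34),
    "Business Detail": (70, 81),
    "Business Contact": (82, 55),
    "Verbal-Linguistic": (10, 51),
}

statistic, dof = chi_square_independence(list(personnel_officers.values()))
print(f"chi-square = {statistic:.1f} on {dof} degrees of freedom")
# prints "chi-square = 81.3 on 6 degrees of freedom"
```

Under these assumptions the statistic agrees with the value given in the table heading; the largest contributions come from the personnel and verbal-linguistic groups.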

The highest percentage of personnel offi- 
cers whose stated interests do not agree with 
their measured interests is also in the personnel
area. Twenty-eight per cent of officers
who do not have agreement between
stated and measured interests are in the per- 
sonnel area. This percentage is less than the 
percentage of officers who have both stated 
and measured interests in the personnel area. 
The fact that there is such a concentration of 
officers in the personnel area even when there 
is disagreement between stated and measured 
interests is not surprising in view of the fact 
that about one-third of all the personnel offi- 
cers selected personnel work as their civilian 
choice of occupation. Of this one-third about 
two-thirds also had measured interests in this 
area.


Summary 


1. There was good agreement in judging 
primary interest patterns on the Strong Vo- 
cational Interest Blank for this sample of Air 
Force Officers in the personnel and account- 
ant-comptroller areas. In judging primary in- 
terest patterns the agreement between at least 
two of three judges was 90%.

2. There was good agreement in assigning 
the civilian occupations selected by the Air 




Force Officers to the appropriate occupational 
groups on the Strong Vocational Interest 
Blank. The agreement between at least two 
of the three judges was 78%. 

3. The Air Force Officers in this study 
tended to select civilian occupations which 
were similar to their military occupations. 
About one-third of the personnel officers had 
stated interests in the personnel area and 
about one-half of the accountant-comptrollers 
had stated interests in the business detail 
area. 

4. In some occupational areas there was 
greater agreement between stated and meas- 
ured interests than in other occupational 
areas. For the personnel officers the best 
agreement was in the personnel area and for 
the accountant-comptrollers it was in the 
business detail area. However, no statistical 
analyses were made of these data for the ac- 
countant-comptrollers because of the extreme 
concentration of both stated and measured 
interests in this one area to the exclusion of 
all the other occupational areas. 


Received April 3, 1957. 


References 


1. Darley, J. G., & Hagenah, Theda. Vocational
interest measurement. Minneapolis: Univer. of
Minn. Press, 1955.

2. Jenson, P. G. A normative study of the Strong 
Vocational Interest Blank for male adult 
workers. Unpublished doctoral dissertation, 
Univer. of Minn., 1955. 

3. Strong, E. K., Jr. Manual for Vocational Inter- 
est Blank for men. Stanford: Stanford Uni- 
ver. Press, 1951. 

4. Strong, E. K., Jr. Vocational interests 18 years
after college. Minneapolis: Univer. of Minn.
Press, 1955.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


An Evaluation of Two Attitudinal Approaches to Delegation * 


Allen R. Solem 
University of Maryland 


Current problem-solving procedures in busi- 
ness and industry indicate that there are 
many different points of view concerning the 
supervisory function of delegation (1, 2, 3, 4, 
5, 6, 7, 9, 10, 13, 14, 16, 17, 18, 19). Although
there appear to be relatively few problems
in delegating the execution of decisions,
there is a wide range of opinion concerning 
the degree to which it is advisable to share 
the decision-making function itself. Since not 
all decisions are properly subject to delega- 
tion the differences can be attributed in part 
at least to the types of problems involved. A 
more important factor, however, seems to be 
the superior’s frame of reference toward his 
job and his subordinates. Some superiors 
prefer to decide things on their own with little 
or no prior consultation; others tend to seek 
the advice of staff experts or peers before de- 
ciding and still others frequently use con- 
sultative procedures for obtaining the views 
of subordinates as a basis for their decisions. 
Despite these variations in procedure a com- 
mon factor in most approaches is that the 
superior must retain the authority to modify 
or reject ideas or decisions which do not meet 
with his approval. 

In contrast to this frame of reference is the 
attitude of placing final responsibility for cer- 
tain decisions and for the end results in one’s 
subordinates (8, 11, 13, 14). Such an atti- 
tude would imply that the superior reserves 
the right to decide what are the decisions he 
must make himself and what are those to be 
delegated. However, once the responsibility 
for making a decision or developing a solu- 
tion has been placed in one’s subordinates, 
the assumption is that the superior will accept
and support the action regardless of
whether he personally agrees with it or not. 




This means that subordinates are held ac- 
countable for results, not for developing solu- 
tions designed to obtain the approval of the 
superior.

The difference between these two views of
delegation raises a number of questions relevant
to problem-solving in management, including:
(a) What influence, if any, does the
delegation approach that is used have on
solution quality? (b) Is there any difference
between the two approaches as to the acceptance
of the decisions by those who must carry
them out? (c) What implications are there
in the two procedures for the development of
problem-solving and managerial abilities of
subordinates? (d) To what extent do the
differences between the two approaches reflect
attitude differences as contrasted to such
attributes as knowledge and skill? (e) What
guides, if any, do the differences between
these views of delegation indicate for management
training and research?


Method


Subjects. The Ss were 456 supervisors attending a
foremen’s conference. They represented several levels
of management and many different industries.

Role-playing problems. Two different management
problems were used. Both problems have appeared
in other previous publications (12, 13, 14)
and are merely summarized here. One problem (referred
to later as the New Truck Problem) concerns
the allocation of a new truck among the five members
of a crew of repairmen, all of whom want the
truck. This creates an attitude conflict among the
members which must be resolved before a solution
can be reached. The other problem (designated as
the Change of Work Procedure Problem) involves
a crew of three men on a routine assembly operation
who rotate positions periodically in order to
prevent boredom. Meanwhile a methods study has
revealed that if each man were to remain on the
position for which he is best suited there will be [...]
time per unit. However, [...]

Procedure. Multiple Role Playing (15) was used as the
experimental procedure. Separate experimental sessions
were held (one for


Table 1

Solutions to Problems Developed by Supervisors Under Conditions of Limited Delegation (LD)
and Full Delegation (FD)
(N = 456)

New Truck Problem Solutions

                                        Assignment of     Giving Varying Majorities
                          No. of        New Truck to      Different Trucks
                          Groups        Senior Man        (5, 4, & 3)
                          (2)           (3)               (4)
LD groups                 19            10.5              21.1
FD groups                 27            48.2              55.5
x² (computed from
  frequencies)                          7.18              5.48
Level of significance                   (.01-.001)        (.01-.02)

Change of Work Procedure Problem

                                                          Estimating
                          No. of        Problem           Production
                          Groups        Persons           Increase
                          (5)           (6)               (7)
LD groups                 20            20.0              71.2
FD groups                 25            10.7              92.0
x² (computed from
  frequencies)                          2.30              13.42
Level of significance                   (.10-.20)         (.001-.0001)

Both Problems Combined

                          Satisfied     Dissatisfied      Conditional
                          Leaders       Group Members     Solutions
                          (8)           (9)               (10)
LD groups                 84.6          19.7              23.1
FD groups                 98.1          8.1               40.4
x² (computed from
  frequencies)            6.60          10.08             3.02
Level of significance     (.01-.02)     (.01-.001)        (.05-.10)


each problem) and the Ss in the two sessions were
different so that practice effects were minimized. 


Problem 1 


New truck problem. Following a lecture on the 
subject of attitudes, the Ss were formed into labora- 
tory groups of approximately 40 individuals. These 
groups then met in separate rooms under the leader- 
ship of trained experimenters. When each group 
had assembled the Ss were informed that they were 
to participate in a discussion of a management prob- 
lem involving a foreman and his crew of 5 repair- 
men, and since they were going to be these men in 
the discussion, the Ss were asked to form into groups 
of six. A brief discussion was held as to the nature 
of role-playing procedures. Following this was a 
presentation of certain essential background infor- 
mation on the problem. The experimenter then gave 
one person at random in each group of six a set of 
roles (this person was thus designated as the leader 
of his group) with the instruction to retain the 
leader role (including the problem) and distribute 
the remaining five roles among the members in his 
crew. The sets of roles were the same for all crews; 
however, the individual member roles were different. 
Half of the leaders were given a written attitude in- 
struction toward deciding what would be the fairest 
solution to the problem of allocating the new truck 
and then discussing the solution with their crews. 
The remaining half of the leaders were given a writ- 
ten attitude instruction toward presenting the prob- 
lem to the crew for discussion and accepting what- 
ever solution was developed. After 25 minutes of 
interaction, all discussions were ended and the ex- 
perimenter then proceeded with the collection of the 
data and a general discussion of the results.

Data collection procedure. Data on the following 
aspects of the solutions were obtained from all 
groups, one group at a time. 


1. Who got the new truck?
2. What disposition was made of that person’s old 
truck [until all trucks had been accounted for]? 


3. Are there any other aspects of the solution not 
already covered? 

4. Is the leader satisfied or dissatisfied? 

5. Which crew members are dissatisfied? 


Problem 2 


Change of work procedure. The sequence of steps 
in the experimental procedure was the same as for 
the first problem. However, the experimental period 
was preceded with a lecture on frustration princi- 
ples. In the laboratory session itself, the Ss were 
asked to form groups of four persons, as called for 
by the problem. Also the questions used in the col- 
lection of the data were different and consisted of 
the following: 


1. What is the solution in your group [asked of 
the leader]? 

2. Which of your crew members, if any, showed 
stubborn, hostile, or uncooperative reactions so 
as to create a problem in the discussion? 

3. What will happen to production if the solution 
you have settled on is put into effect [asked of 
all participants with separate tabulations for 
“increase,” “decrease,” and “stay about the 
same”]? 

4. Are you [the leader] satisfied or dissatisfied 
with the solution? 

5. Which crew members [if any] are dissatisfied? 


Results 


The results are shown in Table 1. The 
data which are unique to each problem are 
shown in Columns 2 through 7 and those 
which are common to both problems have 
been combined in Columns 8, 9, and 10. 

Under limited delegation (LD) the senior 
man had about one chance in ten of getting 
the new truck. However, under full delega- 
tion (FD) the new truck was assigned to the 
senior man in nearly half of the solutions.




Thus it seems that the values in seniority as 
a basis for assigning the new truck tended to 
mean different things under the two delegation
conditions.

When one crew member receives the new 
truck, then his previous one must be assigned 
to a different crew member or be disposed of. 
This means that several exchanges of vehicles 
may occur. Since such exchanges are volun- 
tary they will occur when both parties feel 
they will gain in some way. The number of 
exchanges therefore may be taken as a meas- 
ure of solution quality. Viewed in this light, 
the fact that three or more of the five crew 
members received different trucks about 2½
times as often under FD as under LD sug- 
gests that the superiors were less likely to see 
the possibility of rewarding several individu- 
als than were the subordinates. A similar 
tendency is indicated with respect to the con- 
ditional solutions in Column 10. These are 
solutions which contain unique features or 
extras designed to satisfy particular needs of 
subordinates, and such solutions tended to 
occur more frequently under the FD condi- 
tion. Further, an inspection of the raw data 
reveals an interesting qualitative difference 
in that the conditions developed under LD 
tend to be in the nature of concessions exacted
from the superior, and under FD the
conditions are in the nature of constructive
improvements to the solution.
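
The chi-square values in Table 1 are described as computed from frequencies, while the table itself lists percentages. A minimal sketch of how the first two New Truck columns could be checked follows. It assumes the percentages are shares of groups, so that the counts can be recovered by rounding, and it uses the standard 2 x 2 shortcut formula without a continuity correction. Both points are assumptions, not statements from the article, and the function names are hypothetical.

```python
def count_from_percent(percent, n_groups):
    """Recover a whole-number frequency from a reported percentage."""
    return round(percent / 100 * n_groups)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


LD_GROUPS, FD_GROUPS = 19, 27  # numbers of groups on the New Truck Problem


def column_chi_square(ld_percent, fd_percent):
    """Chi-square for one percentage column, LD groups against FD groups."""
    ld_yes = count_from_percent(ld_percent, LD_GROUPS)
    fd_yes = count_from_percent(fd_percent, FD_GROUPS)
    return chi_square_2x2(ld_yes, LD_GROUPS - ld_yes, fd_yes, FD_GROUPS - fd_yes)


chi2_senior = column_chi_square(10.5, 48.2)  # new truck assigned to senior man
chi2_trucks = column_chi_square(21.1, 55.5)  # three or more men given different trucks
print(round(chi2_senior, 2), round(chi2_trucks, 2))
```

Under these assumptions the two results round to the 7.18 and 5.48 shown in the table, which suggests the reported values were obtained without a continuity correction.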

In Column 7 a somewhat different measure 
of solution quality is indicated for the Change 
of Work Procedure Problem. These data rep- 
resent the views of superiors and subordinates 
as to whether the adoption of the new solu- 
tion will result in an increase in production 
vs. a decrease or no change. While it seems 
probable that feeling judgments of acceptance 
as well as intellectual evaluations of solution 
quality are both represented, the difference in 
proportion of those predicting a production


increase to occur is significantly in favor of 
the FD condition. 


[...] interest to note the [...]
than was true under FD. In other words the
superiors using the LD procedure apparently 
had a less pleasant experience in conducting 
their discussions than was true of those using 
the FD approach, yet were unable to improve 
things appreciably once the discussions got 
under way. Further evidence of a related na- 
ture is indicated in Columns 8 and 9 which 
show that there were significantly fewer
satisfied leaders under the LD condition and 
a significantly greater proportion of dissatis- 
fied subordinates than under FD. 


Discussion 


The results indicate that a superior who re- 
serves to himself the authority to make final 
decisions may not always expect as satisfac- 
tory results as when full responsibility for 
solving certain problems is delegated to one’s 
subordinates. Regardless of how perceptive 
and fair-minded the superior may try to be, 
it appears that he may often tend to misjudge 
the importance of group values and to overlook
various opportunities for rewarding his
subordinate group members. Further, the indications
are that the LD approach as compared
to FD is more likely to generate hostility
and dissatisfaction among subordinates
and result in a less satisfactory problem-solv- 
ing experience for the superior himself. 

In part at least, the differences in results 
appear to arise from the fact that the LD 
procedure causes the superior to take an initial
position as to wha[...] so that, in reality [...]
to the group, not a problem. To the degree
that the soluti[...] and ideas of [...] focal point
fo[...]tion and criti[...] hand tends t[...]tive
thinking [...]tions. Hence, by presenting his own
views as a solution the [...]ending a given set of
[...] aiding his subordinates in [...] a decision in
the light of important new information, this is not
likely to occur when a superior feels that such
action may be interpreted as a sign of weakness
or indecision.




From this it appears that an important 
contribution of the full delegation attitude of
the superior is that it influences subordinates
toward constructive solution of a problem on
its own merits. In so doing, it helps to avoid
any tendencies toward merely giving lip service
to a superior’s solution, of arguing with
him, or of doing as directed with reduced
motivation.

The results from this one experiment yield 
only partial answers to some of the questions
raised earlier in this report and even these
answers must be viewed with reservations.
For one thing the previous supervisory experiences
of the subjects may have caused
them to react differently to the experimental
situation from other nonsupervisory employees
in industry. In addition, the delegation
conditions tested may not be representative
of more than a very limited segment of
managerial situations. Further studies for the
exploration of these and other related issues
are now in progress.


Summary 
This experiment was concerned with the
study of attitudinal influences on the delegation
process. Management personnel were
formed into groups of four and six members
for the purpose of solving two different but
[...] industrial problems. In each group
one member was selected at random to role
play the part of the superior and the other
members took the part of his subordinates.
In half of the groups the superior was given
an attitude cue toward arriving at a decision
and then discussing things with his subordinates,
thus limiting the delegation of problem
solving. The remaining superiors were
given an attitude cue toward presenting the
problem to their subordinates for their solution
and accepting whatever decision was
made, this being termed full delegation. In
terms of solution quality, acceptance and
satisfaction of superiors and subordinates,
the full delegation procedure consistently
yielded the more satisfactory results, several
differences being significant beyond the
2% level of confidence. The results are interpreted
to mean that attitudes of supervision
toward the delegation process may be
an important factor in the solution of certain
management problems.


Received April 4, 1957.


References 


1. Barnard, C. I. The functions of the executive. 
Cambridge: Harvard Univer. Press, 1938. 

2. Barnard, C. I. The nature of leadership. In 
S. D. Hoslett (Ed.), Human Factors in Man- 
agement. Parkville, Mo.: Park Coll. Press, 
1946. 

3. Browne, C. G. Study of executive leadership in 
business: The R, A, and D Scales. J. appl. 
Psychol., 1949, 33, 521-526. 

4. Ginzberg, E. What makes an executive? New 
York: Columbia Univer. Press, 1955. 

5. Hemphill, J. K. Leader behavior description. 
Personnel Research Board Monograph, Ohio 
State Univer., 1950.

6. Johnson, R. W. Human relations in modern
business. In E. C. Bursk (Ed.), Human relations
for management. New York: Harper,
1956.


7. Katz, D., Maccoby, N., & Morse, N. C. Productivity,
supervision and morale in an office
situation. Ann Arbor: Univer. of Michigan,
1950.

8. Lewin, K. Group decision and social change. 
In T. M. Newcomb & E. I. Hartley (Eds.), 
Readings in social psychology. New York: 
Henry Holt, 1947. 

9. McCormick, C. P. Multiple management. New 
York: Harper, 1938. 

10. McGregor, D. The conditions of effective lead- 
ership. In S. D. Hoslett (Ed.), Human fac- 
tors in management. Parkville, Mo.: Park 
Coll. Press, 1946. 

11. Maier, N. R. F. A human relations program for 
supervision. Indus. & Lab. Rel. Rev., 1, 3,
1948. Pp. 443-464. 

12. Maier, N. R. F. Principles of human relations;
applications to management. New York:
Wiley, 1952.

13. Maier, N. R. F. Psychology in industry. (2nd
ed.) Boston: Houghton Mifflin, 1955.

14. Maier, N. R. F., Solem, A. R., & Maier, A.
Supervisory and executive development; a 
manual for role playing. New York: Wiley, 
1957. 

15. Maier, N. R. F., & Zerfoss, L. F. MRP: a technique
for training large groups of supervisors
and its potential use in social research. Hu- 
man Relat., 1952, 5, 177-186. 

16. Simon, H. A. Administrative behavior: a study 
of decision making processes in administrative 
organization. New York: Macmillan, 1951.

17. Spriegel, W. R., Schulz, E., & Spriegel, W. B.
Elements of supervision. (2nd ed.) New 
York: Wiley, 1957. 

18. Tead, O. The development of leadership power.
In S. D. Hoslett (Ed.), Human factors in
management. Parkville, Mo.: Park Coll. Press,
1946.

19. Thelen, H. Dynamics of groups at work. Chi- 
cago: Univer. Chicago Press, 1954. 


Journal of Applied Psychology
Vol. 42, No. 1, 1958


The Effects of Sound Films on Opinions About Mental 
Illness in Community Discussion Groups ¹


Elliott McGinnies, Robert Lana, and Clagett Smith ²


University of Maryland 


Research concerning the effects of mass 
communications on attitudes and opinions has 
generated a rather perplexing set of results. 
Despite common belief that the mass media 
exert a profound influence upon the manners 
and morals of recipients, evidence to this ef- 
fect is still fragmentary and controversial. 
Early investigators in the area of motion pic- 
ture films, for example, have reported not only 
immediate but persisting attitude changes fol- 
lowing exposure to a single film. The findings 
of Peterson and Thurstone (9) who used films 
to induce changes in the attitudes of children 
on such topics as nationality, crime, and war 
are typical of results reported in this area. 
These investigators further reported that a se- 
ries of films was sometimes successful in in- 
ducing attitude change where a single film 
had failed. Hoban and van Ormer (4), on 
the other hand, have examined studies done 
with Army training films and educational films 
and have concluded that a single communica- 
tion produces only a temporary effect upon 
attitudes, if any. Fearing (2), who expresses 
scepticism with respect to the usefulness of
films for this purpose, states that “. . . re- 
search, conducted as carefully as we know 
how to conduct it, reveals that the effects of 
these media—films and radio, especially films 
—on human attitudes and behavior is unex- 
pectedly slight.” Factual knowledge, how- 
ever, as many studies have shown, can effec- 
tively be imparted by the use of instructional 
films (4). 

Evidence has also been presented to show
that group discussion facilitates learning, atti- 
tude change, and readiness to make a decision 


¹ [...] U. S. Public Health Service. Richard Bell and
Hyman Goldstein of NIMH took an active part in the
initial planning [...] Joseph Bobbitt of NIMH has
been helpful at all stages of the study.

² Now in the Graduate Department of Social Psychology
at the University of Michigan.




with respect to communicated material. Hov- 
land, Janis, and Kelley (5) suggest the pos- 
sibility that acceptance as well as learning 
may be affected by eliciting overt verbali- 
zations about a persuasive communication. 
They state, “When an individual verbalizes an 
idea to others he becomes more inclined to 
accept it himself.” Bennett (1) has con- 
cluded that the function of a group discussion 
is to “facilitate decision and/or the percep- 
tion of consensus. ...” Somewhat more 
tangential evidence for the efficacy of group 
discussion under these conditions comes from 
a study by Timmons (11), who found that 
individuals allowed to discuss a problem in a
small group obtained solutions superior to
those of persons who did not discuss the
material.

In light of the possibility that motion pic- 
ture films may under some circumstances be 
effective persuasive devices, the present study 
was undertaken for the purpose of evaluating 
the effects of one or more mental health films 
in adult community groups. A questionnaire 
covering opinions and beliefs with respect to
mental illness was given to groups before and
after exposure to a single film or to a series
of films. In order to determine whether active
participation would facilitate any effects
of the films, discussions were held in half of
the groups following film presentations and
prior to the second administration of the opinion
inventory.

The hypotheses examined in these experiments
were as follows:

1. A single mental health film presented to
an audience without discussion will significantly
influence opinions about mental illness.

2. Group discussion of a single film will
facilitate opinion change as compared with
the nondiscussion situation.

3. A series of three mental health films
presented without discussion will result in
greater opinion change than that generated
by a single film under the same conditions.

4. Group discussion following each of a series
of three films will bring about greater
opinion change than under nondiscussion conditions.


Experiment One 
Method 


Subjects. Six small groups totaling 76 individuals 
were formed from larger P. T. A. and child-study
groups in Prince Georges County, Maryland. Four
of these groups, varying in size from 11 to 18 members,
were shown a series of mental health films.
Two additional groups containing nine members each
served as controls. In general, the group members
were drawn from the upper middle class segment of
the population and were fairly homogeneous with
respect to age and education. Several of the groups
contained both men and women, while the remainder
were composed entirely of women. Since there is no
evidence to indicate that sex is systematically related
to susceptibility to a persuasive communication, no
attempt was made to balance this factor over all of
the groups. The mean age of the Ss was 38.8 years,
they had enjoyed on the average 2.8 years of college
education, their mean family income was $7,800, and
63% identified their occupation as “housewife.” On
a nine-point self-rating scale of “familiarity with
mental health problems and concepts” they assigned
themselves a mean scale value of 4.5.
Materials. The films selected for study were The
Feeling of Rejection, The Feeling of Hostility, and
Breakdown. The first two deal with the etiology
of personality disturbance, while the third is concerned
with institutional treatment of psychotic disorders.
All three films were produced by the National
Film Board of Canada and are widely used in
mental health education programs.

In order to construct an instrument that would
enable us to assess the opinions and beliefs of the Ss
with respect to various aspects of mental illness, we
first assembled a pool of 112 relevant statements derived
from several sources (7, 8, 10, 12). The statements
dealt in general with the etiology, perception,
prognosis, treatment, and post-treatment perception
of mental illness. A questionnaire consisting of
these items was initially administered to 157 students
in psychology and sociology at the University of
Maryland. Item analysis employing Flanagan’s approximation
method (3) justified reduction of the
questionnaire to 72 statements. Repetition of this
procedure with adult groups yielded a final list of
47 items. Test-retest reliability with samples
of University students and adult P. T. A. groups
was about .86. A split-half reliability coefficient
of .83 was obtained from an independent sample using
the Spearman-Brown formula. Responses to the
statements were made on a five-point rating scale
ranging from “strongly agree” to “strongly disagree,”
and the method of summated ratings was used to
obtain individual scores. Following are examples of
the types of items included on the form:

1. It is better not to discuss a mental illness as one 
would a physical illness. 

2. Few of the people who seek psychiatric help 
need the treatment. 

3. An employer should avoid hiring someone who 
has been in a mental hospital. 

4. Nervous breakdowns are due to overwork. 
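
The Spearman-Brown formula mentioned above steps the correlation between two half-tests up to an estimate for the full-length test, r_full = 2 r_half / (1 + r_half). The sketch below is only an illustration of that formula in its general form: the function name is hypothetical, and the half-test correlation passed to it is an arbitrary example value, since the report does not give the uncorrected correlation.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Project reliability when a test is lengthened by `length_factor`.

    With length_factor=2 this is the usual split-half correction.
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


# Hypothetical half-test correlation, chosen only to show the arithmetic.
example_half_correlation = 0.5
print(spearman_brown(example_half_correlation))
```

A length factor of 1 leaves the correlation unchanged, which is a quick sanity check on the formula.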

Scoring of the items was based upon the responses 
of 12 staff members and graduate trainees at the 
University of Maryland Counseling Center. A high 
score was obtained by S if his responses were in the 
same direction as those expressed by these “experts.” 
A low score indicated disagreement with this pro- 
fessional opinion. As nearly as could be determined, 
the opinions and beliefs of our panel of professionals 
coincided with the general points made in the films. 
It should be noted that we do not refer to the meas- 
uring instrument as an “attitude scale,” since we 
have little evidence that the assumptions underlying 
a true scale have been met. Experience with the 
questionnaire, however, has indicated that it does 
measure reliably certain beliefs and opinions that 
people hold with respect to various aspects of men- 
tal illness. We shall refer to the questionnaire as the 
“Mental Health Opinion Inventory.”

The range of possible scores on the inventory was 
47-235. If S checked the middle, or indeterminate, 
position for each statement he would achieve a total 
score of 141, indicating neither agreement nor dis- 
agreement with professional opinion. A total score 
on all items of 188 would indicate agreement but not 
strong agreement with expert opinion. Since the 
group mean pretest scores ranged from 163.5 to 
184.9, it is apparent that our Ss were initially some- 
what predisposed toward professional judgment on 
the scale items. Our experience, however, has been 
that in groups of the types studied it is exceedingly 
difficult to discover opinions about mental health 
issues that are markedly naive or inaccurate. The 
individuals who would be most likely to score low 
on questionnaires of this type are precisely those 
persons who do not participate in community ac- 
tivities designed for educational purposes. In the 
present instance, the goal of the communicator is 
limited to overcoming certain misconceptions that 
exist among groups of interested persons who other- 
wise are fairly well-informed about mental health 
problems. 
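
The score landmarks quoted above (47, 141, 188, 235) follow directly from summing 47 items scored 1 to 5. The sketch below shows that arithmetic. It assumes, as an illustration only, that each item is keyed by whether the expert panel agreed with the statement and that responses are coded 1 ("strongly agree") to 5 ("strongly disagree"); the function names and the coding are hypothetical, not taken from the inventory itself.

```python
# Sketch of summated-ratings scoring for a 47-item, five-point inventory.
# The response coding and the per-item expert key are illustrative assumptions.

N_ITEMS = 47
SCALE_MIN, SCALE_MAX = 1, 5  # item score: 1 = farthest from experts, 5 = closest


def item_score(response, experts_agree):
    """Score one response coded 1 (strongly agree) to 5 (strongly disagree).

    When the panel agreed with the statement, "strongly agree" earns 5;
    when the panel disagreed, "strongly disagree" earns 5.
    """
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError("response must be between 1 and 5")
    return (SCALE_MIN + SCALE_MAX - response) if experts_agree else response


def total_score(responses, key):
    """Sum the item scores; `key` holds one boolean per item."""
    return sum(item_score(r, k) for r, k in zip(responses, key))


lowest = N_ITEMS * 1    # strong disagreement with experts on every item
neutral = N_ITEMS * 3   # middle position on every item
agree = N_ITEMS * 4     # agreement, but not strong agreement, on every item
highest = N_ITEMS * 5   # strong agreement with experts on every item
```

On this coding, the reported group pretest means of 163.5 to 184.9 lie between the neutral total and the "agree" total, which is the sense in which the Ss were initially predisposed toward professional judgment.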

A biographical questionnaire was also given to all 
Ss in the experiments so that we could determine 
whether the groups were comparable with respect to 
a number of socioeconomic criteria. As indicated 
earlier, no serious discrepancies appeared among the 
several groups in this respect. 

Procedure. Two of the groups viewed the three 
films at bi-weekly intervals. Each film presentation 
was followed by a half-hour discussion of the film 
or of related topics. The same discussion leader, a 
professional psychologist, met all of the groups in 
order to control any effects that the leader’s person- 


42 Elliott McGinnies, Robert Lana, and Clagett Smith 


ality might have upon the discussion process. In or- 
der to permit the fullest expression of individual 
opinion by the group members, a permissive or non- 
directive approach was taken by the discussion 
leader. At the outset of the meetings, the groups 
were informed that the discussions would be re- 
corded. No further mention was made of this, and 
most of the Ss later appeared oblivious to the fact 
that a tape-recording was being made. The record- 
ing apparatus was always operated from the rear of 
the room, and the microphones were strategically lo- 
cated before the Ss assembled. A fuller description 
of this procedure is reported elsewhere (6). The 
Mental Health Opinion Inventory was administered 
before the first film was shown and at the conclusion 
of the third discussion. 

The same procedure was followed for two addi- 
tional groups except that discussion of the films was 
omitted from the meetings. In order to control for 
expectancy of discussion and the possible effects of 
this upon perception of the films, these groups were 
told that they would discuss all three films at the 
conclusion of the final screening. They were allowed 
to do this only after they had completed the inven- 
tory for the second time, so that the discussion could 
have no effect upon the measurement of opinion
change.

Two control groups simply responded twice to the
inventory, with a four-week interval between
administrations.


Results 


It had been predicted that opinions and beliefs
about mental illness, as reflected in responses
to the Mental Health Opinion Inventory,
would be altered as a result of exposure
to the film series. It was also hypothesized 
that opinion change would be greater in those 
groups that had discussed the films as com- 
pared with the groups that were given no op- 
portunity for discussion. Discrepancies be- 
tween scores on the pre- and posttreatment 
administrations of the inventory were taken 
as measures of opinion change. Table 1 shows 
the mean scores for all of the groups before 
and after experimental treatment. Positive 
difference scores indicate movement in the di- 
rection of professional opinion on the ques- 
tionnaire. 

Before testing for the effects of treatment 
upon opinions and beliefs about mental ill- 
ness, the difference scores for the two groups 
within each treatment were examined for het- 
erogeneity of variance and for differences be- 
tween means. In all cases, the two groups 
representing each treatment did not differ sig- 
nificantly in either of these respects. The 
within-treatment groups, therefore, were com- 
bined in assessing over-all effects of experi- 
mental procedure upon opinion change. These 
treatment means are shown in Table 1. It is
apparent that both the film-alone and film- 
discussion groups were influenced by the ex- 
perimental conditions, while the control groups 


Table 1

Pretest, Posttest, and Difference Scores for All Groups

                      Pretest              Posttest             Group        Treatment
                                                                Mean         Mean
Group                 M            SD      M            SD      Difference   Difference
Film-discussion I     163.5        18.6    183.1        16.0     19.6        [illegible]
Film-discussion II    184.9        17.8    198.0        14.2     13.1
Film-alone I          174.6        26.4    186.5        23.7     11.9        [illegible]
Film-alone II         173.1        19.5    191.9        10.7     18.8
Control I             [illegible]  15.6    [illegible]  17.8     -3.0        [illegible]
Control II            169.8        17.8    173.7        19.1      3.9


Effects of Sound Films on Opinions 43 


Table 2

Analysis of Variance on Adjusted Posttest Scores
(Covariance Method)

                 Sum of      Mean
Source     df    Squares     Square     F        P
Between     2     3,562.4    1,781.2    17.10    <.001
Within     73     7,604.0      104.2
Total      75    11,166.4


had approximately the same scores on both
administrations of the inventory.

In order to control for differences among
the groups in initial opinion, analysis of covariance
of the mean opinion change scores
was employed. The results of this analysis
are summarized in Table 2. Treatment effects,
as indicated in the table, were significant
at the .001 level. To determine the
specific sources of this between-treatment variance,
three analysis of covariance t tests were
performed. This test involves weighting the
denominators of the conventional t ratios by
the coefficient of alienation of the entire sample.
Of the three possible comparisons, two
were significant. Both the film-discussion and
the film-alone groups differed significantly at
the [illegible] level from the control groups. The
third comparison, that between the film-discussion
and the film-alone conditions, was not
significant at the .05 level. Participation
in group discussion of the films did not
augment the effects of the films so far as
changes in opinion scores were concerned.
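
The arithmetic behind Table 2 can be retraced as a check: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below does this with the tabled values. It also includes one possible reading of the covariance t test described above, in which the denominator of a conventional t ratio is multiplied by the coefficient of alienation, sqrt(1 - r^2); that reading, the function names, and the choice of r as the pretest-posttest correlation are assumptions for illustration, not the authors' stated computation.

```python
import math


def mean_square(sum_of_squares, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ss_between, df_between, ss_within, df_within):
    """F = between-treatment mean square / within-treatment mean square."""
    return mean_square(ss_between, df_between) / mean_square(ss_within, df_within)


def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2) for a correlation r."""
    return math.sqrt(1.0 - r * r)


def covariance_t(mean_difference, standard_error, r):
    """Assumed form of the adjusted t: the conventional denominator
    (standard error of the difference) is weighted by k."""
    return mean_difference / (standard_error * coefficient_of_alienation(r))


# Values as printed in Table 2 (adjusted posttest scores).
table2_f = f_ratio(3562.4, 2, 7604.0, 73)
```

With the tabled sums of squares, `table2_f` rounds to the printed F of 17.10. Because k is never greater than 1, the assumed adjustment can only enlarge a t ratio relative to its conventional value.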

In order to examine the possibility that a
"consensus" effect might have been generated
in the discussion groups, even though this was
not reflected in greater opinion change, the
variances of pre- and posttreatment inventory
scores were compared for the various experimental
conditions. In no instances did the
variances differ between groups, either before
or after treatment, indicating that uniformity
of opinions following exposure to the film
series was no greater under discussion than
under nondiscussion conditions.

The possibility now remained that one of
the films was responsible for the obtained
opinion changes. It had been determined, for
example, that the groups tended to prefer the
film Breakdown the most and The Feeling
of Rejection the least. It was also conceivable
that discussion had failed to summate
with the effects of the films because the
accumulated impact of the film series produced
a maximum effect by itself. A further
study, therefore, was designed to determine
(a) whether one of the three films was responsible
for most of the measured effects
upon opinion, and (b) whether group discussion
would generate greater opinion change
as compared with nondiscussion conditions
when a single film rather than a series of films
was used.


Experiment Two

Method 

Subjects. A total of 64 individuals forming six 
groups participated in this study. The groups varied 
in size from 8 to 13 members. Since they were re- 
cruited from the same types of P. T. A. and child- 
study groups as the Ss in Experiment One, they 
were similar in all important respects to the participants
in that study. The mean age of the Ss was
38.3, they had attained an average of 2.5 years of
college education, their mean family income was
$7,280, and 76% identified themselves as "housewives."
On the nine-point self-rating scale of familiarity
with mental health problems they achieved
a mean scale value of 4.1.

Materials. The films, biographical questionnaire, 
and Mental Health Opinion Inventory were the same 
as those used in the first experiment. In this
instance, however, the films were shown singly rather 
than as a series. 

Procedure. In order to make the interval between
administrations of the opinion inventory comparable
to that of the previous investigation, the first
testing was done at a regularly scheduled meeting of
each group. The experimental sessions were held
one month later, at the conclusion of which the post- 
treatment measures were taken. Each of the three 
films previously described was shown to a different 
group, the members of which engaged in discussion 
of the film with the same leader who had served in 
Experiment One. At the conclusion of the discus- 
sions they filled out the opinion inventory to which 
they had first responded four weeks earlier. 

The remaining three groups each viewed one of 
the films and were posttested on the questionnaire 
without discussion. Since the previous study had 
shown that no opinion changes could be expected in 
groups which were not experimentally treated, it was 
considered unnecessary to include this type of con- 
trol in the design. The principal concern of this 
study was to determine the effectiveness of a single 
film with and without discussion upon changes in 
opinions and beliefs about mental illness. Since the 
same three films were used as in the first experiment 




Table 3

Pretest, Posttest, and Difference Scores for All Groups

                             Pretest         Posttest        Mean
Group                 N      M       SD      M       SD      Difference
Film A(a)-discussion  13     165.3   23.9    171.4   18.9      6.1
Film A-alone           8     158.1   21.5    158.5   22.2       .4
Film B(b)-discussion  13     169.9   19.8    171.8   17.4      1.9
Film B-alone           8     155.0   15.6    169.5   17.2     14.5
Film C(c)-discussion  11     168.5   18.7    167.4   19.5     -1.2
Film C-alone          11     171.3   18.4    175.9   16.4      4.6

(a) Film A = The Feeling of Hostility.
(b) Film B = Breakdown.
(c) Film C = The Feeling of Rejection.


and were randomly assigned to the participating 
groups, it was expected that any differences in their 
relative effectiveness would appear in the pre- and 
posttreatment opinion measures. Any interaction be- 
tween number of films shown and opportunity for 
discussion by the audience members would be re- 
vealed in differences between the film-alone and film- 
discussion groups, no such differences having been 
obtained with a three-film series. 


Results 


The initial and posttreatment scores of the 
various groups, together with the mean dif- 
ference scores, are shown in Table 3. It will 
be noted from the table that the range of 
mean group pretest scores is somewhat less 
than in Experiment I, ranging in this instance 
from 155.0 to 169.9. The mean score for 
these Ss is also somewhat lower than for those 
participating in the first study. Since these 
individuals were further from the ceiling of 
the measuring instrument than the Ss in the 
prior experiment, they might have been ex- 
pected to change more readily under persua- 
sive influence, which in this case consisted of 
a single film rather than a series of three 
films. 
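
As a check on Table 3, the mean difference for each group can be recomputed as posttest mean minus pretest mean. The short sketch below does so from the printed means; the dictionary layout and variable names are illustrative only.

```python
# (pretest mean, posttest mean) for each group, as printed in Table 3.
table3_means = {
    "Film A-discussion": (165.3, 171.4),
    "Film A-alone": (158.1, 158.5),
    "Film B-discussion": (169.9, 171.8),
    "Film B-alone": (155.0, 169.5),
    "Film C-discussion": (168.5, 167.4),
    "Film C-alone": (171.3, 175.9),
}

# Positive values indicate movement toward professional opinion.
differences = {
    group: round(post - pre, 1)
    for group, (pre, post) in table3_means.items()
}

# Group with the largest shift toward professional opinion.
largest_gain = max(differences, key=differences.get)
```

Recomputed from the rounded means, the Film C-discussion difference comes out at -1.1 rather than the printed -1.2; a discrepancy of this size is what one would expect if the printed difference was taken from unrounded means, though the source does not say so.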

In order to discover whether any over-all 
differences existed among the various groups, 


an analysis of covariance was done on the 
opinion-change scores. The F test, as indi- 
cated in Table 4, was not significant at the 
.05 level, and the null hypothesis of no dif- 
ferences among the six groups is accepted. 
To determine whether any of the mean dif- 
ference scores from pre- to posttreatment 
were significantly greater than zero, six analy- 
sis of covariance ¢ tests were performed. Only 
one of the groups, the film-alone group view- 
ing Breakdown, showed a significant change 
in mean opinion score following treatment. 
It would be unjustified on the basis of this one
significant finding to conclude that a single
mental health film, with or without discussion,
is capable of influencing opinions about


Table 4

Analysis of Variance on Adjusted Posttest Scores
(Covariance Method)

                 Sum of      Mean
Source     df    Squares     Square     F       P
Between     5       992.4    198.5      1.05    >.05
Within     57    10,774.3    189.0
Total      62    11,766.7




mental illness. Noteworthy, however, is the 
fact that five of the six groups showed opin- 
ion-change scores in the direction supported 
by the films, even though just one of these 
changes is significant. The nonsignificant F
test for between-group treatments indicates
that the discussions added nothing to the
measured effectiveness of the films in the
present study.


Discussion 


Considering the results of both experiments,
it is apparent that only one of our original 
hypotheses has been confirmed, namely, that 
a series of three mental health films presented 
without discussion will result in greater opin- 
ion change than that generated by a single 
film under the same conditions. In fact, only 
one of three films shown singly was effec- 
tive in modifying scores on a questionnaire 
dealing with opinions about mental illness, 
and this was under nondiscussion conditions. 
While a series of three films proved useful in 
bringing opinions about mental illness more 
in line with professional thinking, discussions
following each film in the series failed to augment
this effect.

What do these findings imply for the use of
mental health films in educational programs
as well as for questions concerning the susceptibility
of opinions and attitudes in general
to influence through motion pictures?
For one thing, the findings in these two studies
suggest strongly that a series of films dealing
with a common topic may be effective modifiers
of opinions where a single film is likely
to produce no measurable result. It should
be noted, of course, that none of the films
used here had a running time of more than
[illegible] minutes. The rather dramatic results in
attitude change obtained by Peterson and
Thurstone (9) may be attributable in part to
the fact that they employed full-length Hollywood
productions with considerable emotional
impact. Educational-indoctrination films, such
as those used by the armed forces and those
[illegible], are [illegible] limited in
scope and dramatic appeal. It is not
surprising, then, that the effects of these more
conservative productions upon opinions and
beliefs are often difficult to detect, and that a
single presentation is apt to produce no measurable
changes.

The failure of active participation to fa- 
cilitate opinion change in the two investiga- 
tions reported here may be due to several 
factors. First, the discussions were per- 
mitted to develop along any lines suggested 
by the group members. In general, the discus- 
sions centered about the characters and plots 
of the film rather than about mental health 
problems in general, so that generalization 
from the films to the more discursive items 
on the questionnaire was probably minimized. 
A second and more probable explanation for 
the failure of the discussions to implement
the impact of the films is that the films by 
themselves induced as much opinion change 
as might reasonably be expected in audiences 
of this type. The participants in these stud- 
ies were all of superior educational and eco- 
nomic status, as are most active members of 
community groups formed voluntarily for self- 
education. Consequently, their initial reac- 
tions to the questionnaire items were oriented 
in the direction of those expressed by our 
panel of professionals who provided us with 
anchoring points for the scoring system. That 
the group members moved further in this di- 
rection following treatment is a tribute to the 
effectiveness of the films; to expect even
greater change as a result of group discussion
is perhaps unreal.

It should not be concluded from these find- 
ings that group discussion has no salutary ef- 
fects in conjunction with film presentations. 
There was noticeable discontent among some 
members of the film-alone groups, who mildly 
resented being dismissed without an oppor- 
tunity to talk about the film that they had 
just seen. Even the promise of an organized 
discussion at the conclusion of the film se- 
ries did not completely allay these complaints. 
It has been our experience that community 
groups welcome a chance for discussion of 
films of this type; but it also is important to 
note that the instructional value of the film 
does not seem to rest upon a related discus- 
sion. Whether discussions of a different type, 
for example, those held under a directive 
leader, would be more effective in influencing
attitude change remains a subject for experimental
determination.

A final comment with respect to certain
methodological problems encountered in re- 
search of this sort may be useful. Sampling 
is necessarily subject to some serious limita- 
tions. It is virtually impossible to assign in- 
dividuals randomly to treatments when deal- 
ing with adults who are under no compulsion 
to appear at scheduled times or to meet in 
inconvenient places. Groups must be located 
and persuaded to participate in the research 
project, and it is frequently difficult to sched- 
ule a series of meetings with the same indi- 
viduals in attendance. While it would have 
been highly desirable to include more groups 
under the several conditions of these experi- 
ments, these practical considerations militated 
against such a procedure. Statistical con- 
trols, therefore, must frequently be exerted 
where experimental controls are lacking. 

Despite these limitations, we feel reasonably
confident in concluding: (a) that motion
picture films shown in a coherent series
can significantly modify opinions and beliefs,
and (b) that a series of mental health films
shown with or without audience participation
through organized discussion is effective in
changing opinions and beliefs about mental
illness.


Summary 


Two experiments were designed to evaluate 
the hypotheses that one or more sound mo- 
tion picture films would modify the opinions 
and beliefs of audience members, and that 
group discussion of the films would augment 
this effect. Participants in the studies were 
members of adult community groups. Opin- 
ions were measured before and after experi- 
mental treatment by means of a 47-item ques- 
tionnaire containing statements about mental 
illness and scored by the method of sum- 
mated ratings. 

Results of the two investigations indicated 
that a single mental health film did not pro- 
duce significant changes in opinions toward 
mental illness in groups, regardless of whether 
or not the groups engaged in discussion of 
the films. A series of three films, however,
induced significant shifts of opinion in the
directions intended by the film content.
Degree of opinion change was no greater in
groups which had discussed the films than in 
groups which had not held discussions. 

The findings have been discussed in terms 
of the types of films employed as well as the 
characteristics of typical audiences for which 
these films are intended. 


Received April 8, 1957. 


References 


1. Bennett, E. B. The relationship of group discussion,
   decision, commitment, and consensus
   to individual action. Dissert. Abstr., 1953, 13,
   444-445.

2. Fearing, F. A word of caution for the intelligent
   consumer of motion pictures. The Quart.
   Film, Radio, and Television, Vol. VI, No. 2.

3. Flanagan, J. C. General considerations in the
   selection of test items and a short method
   of estimating the product-moment coefficient
   from data at the tails of the distribution. J.
   educ. Psychol., 1939, 30, 674-680.

4. Hoban, C. F., Jr., & van Ormer, E. B. Instructional
   Film Research, 1918-1950. Instructional
   film research program, The Penn. State
   Coll. Tech. rep. SDC 269-17-19.

5. Hovland, C. I., Janis, I. L., & Kelley, H. H.
   Communication and persuasion; psychological
   studies of opinion change. New Haven: Yale
   Univer. Press, 1953.

6. McGinnies, E. A method for matching anonymous
   questionnaire data with group discussion
   material. J. abnorm. soc. Psychol., 1956, 52,
   139-140.

7. National Opinion Research Center. Popular
   thinking in the field of mental health. Univer.
   of Chicago Press, Survey 272, 1950.

8. Nunnally, J. Opinion-attitude factors in the mental
   health area. Wash., D. C.: Nat'l Instit. of
   Mental Health, P. H. S., Institute of Communication
   Res., Prog. Rep., 1954. (Mimeo.)

9. Peterson, R. C., & Thurstone, L. L. Motion
   pictures and the social attitudes of children.
   New York: Macmillan, 1933.

10. Ramsey, G. V., & Seipp, M. Public opinions
    and information concerning mental health. J.
    clin. Psychol., 1948, 4, 397-406.

11. Timmons, W. M. Decisions and attitudes as
    outcomes of the discussion of a social problem.
    Contrib. Educ., N. Y., Teachers Coll.,
    Columbia Univer. Bur. of Publ., 1939, No.
    [illegible].

12. Woodward, J. L. Changing ideas on mental illness
    and its treatment. Amer. sociol. Rev.,
    1951, 16, 443-454.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


The Influence of Noxious Environmental Stimuli on Vigilance


Michel Loeb and Gabriel Jeantheau


U. S. Army Medical Research Laboratory 


The problem of vigilance or sustained attention
to randomly occurring, obscure signals
has received much attention in recent
years. Many studies (1, 2, 5, 10) have been
concerned with the asymptotic decline in performance
with the passage of time, and most
theoretical formulations (3, 6, 10) have dealt
with this aspect of the problem. A decline in
vigilance has also been observed as a function
of exposure to noxious environmental
stimuli, especially noise and heat (4, 8, 10,
[illegible]). Decrements in vigilance due to heat
have not been reported as consistently as
decrements due to noise.

The deterioration in performance under the
influence of noxious environmental stimuli
might be attributed to a change in the
physiological state of the individual, to the
[illegible] of responses incompatible with the
detection of stimuli or the reaction to them,
or to changes in motivation. In practice it is
difficult to resolve these alternative explanations.
The present study was designed simply
to investigate the influence of combined
noise and vibration, of combined heat, noise,
and vibration, and of heat alone upon the performance
of a simple monitoring task.


Procedure 


The monitoring task employed was
similar to Broadbent's "Twenty Dials" (4).
The S was seated on a bench in an Army
troop carrier (an armored, tracked truck) in
front of a board bearing 20 numbered dials.
The dials, approximately two inches in diameter,
were arranged in two equal and
aligned horizontal rows. A pointer on each
dial normally pointed to a position within an
arc, circumscribed by two black marks and
outlined in blue.

The authors are indebted to Lelon A. Weaver and Bryce [illegible].

The S held a rectangular response board on 
his lap. On it were 20 push buttons, num- 
bered and arranged spatially like the dials on 
the display board. It was S’s task, whenever 
a pointer moved outside the circumscribed 
arc, to push the button corresponding to the 
dial. Response times in milliseconds were
obtained and recorded by E. Pointers moved
at random intervals ranging from one to seven 
minutes. A complete session under experi- 
mental condition consisted of 49 trials and 
lasted for three hours and forty-five minutes. 

Conditions. Every S performed on the task 
four times, under a different environmental 
condition each time. In the stationary night 
(control) condition, the troop carrier was sta- 
tionary, the temperature varied between 65 
and 75 degrees Fahrenheit, the ambient noise 
level varied from 65 to 75 decibels re .0002 
dyne per square centimeter, and there was no 
appreciable vibration. In the moving night 
(noise and vibration) condition, the tempera- 
ture was the same, the noise level was 115 to 
125 decibels, and there was considerable vi- 
bration. (The vehicle noise was random and 
fairly flat up to approximately 5,000 cycles 
per second. No adequate means of measur- 
ing or analysing vibration was available.) In 
the day moving (heat, noise, and vibration) 
condition, the noise and vibration were ap- 
proximately the same as in the night moving 
condition, but the temperature ranged from 
110 to 125 degrees. In the day stationary 
(heat) condition, noise and vibration were 
comparable to that in the night stationary 
condition, but the heat was comparable to 
that in the day moving condition. Humidity 
in all conditions was between 4% and 24%. 

All of the experimental data were collected 
at the Yuma Test Station, Arizona. The
vehicle moved over a level, pre-established
course.

There were 24 possible sequences of the 
four environmental conditions. These were 
arranged in random order and Ss were as- 
signed to them in order of their arrival. 
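
The count of 24 sequences is simply the number of orderings of four conditions (4! = 24). The sketch below enumerates them, puts them in a random order, and hands them out in order of arrival. The seed, the function name, and the use of a seeded generator are illustrative assumptions; the source does not describe how the random order was produced. With 12 Ss, only half of the 24 sequences would be used.

```python
import itertools
import random

CONDITIONS = (
    "night stationary",  # control
    "night moving",      # noise and vibration
    "day moving",        # heat, noise, and vibration
    "day stationary",    # heat
)


def assign_sequences(n_subjects, seed=0):
    """Return one condition order per subject, in order of arrival.

    All 4! orderings are shuffled once; the first arrival receives the
    first ordering in the shuffled list, the second arrival the next, etc.
    """
    sequences = list(itertools.permutations(CONDITIONS))
    if n_subjects > len(sequences):
        raise ValueError("more subjects than distinct sequences")
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    rng.shuffle(sequences)
    return sequences[:n_subjects]


all_sequences = list(itertools.permutations(CONDITIONS))
assignment = assign_sequences(12)
```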

Subjects. Twelve Ss were recruited from 
the personnel of the Army Medical Research 
Laboratory. They were screened for good 
auditory acuity and generally good physical 
condition. A prize of twenty dollars was of- 
fered to the S with the best overall perform- 
ance on the task. 


Results 


For purpose of analysis the 49 trials were 
divided consecutively into seven blocks of 
seven trials. Each block represented an in- 
terval of approximately 32 minutes. It was 
decided that the median would best represent 
the average performance for each block. 
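
The blocking described here can be sketched as follows: 49 response times are cut into seven consecutive blocks of seven, and each block is summarized by its median. A session of 3 hours 45 minutes is 225 minutes, so each block spans about 225 / 7, or roughly 32 minutes. The logarithmic step mirrors the transformation applied before the analysis of variance; base 10 is an assumption here, since the source does not give the base, and the function names are hypothetical.

```python
import math
import statistics


def block_medians(response_times_ms, block_size=7):
    """Median response time for each consecutive block of trials."""
    if len(response_times_ms) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    return [
        statistics.median(response_times_ms[i:i + block_size])
        for i in range(0, len(response_times_ms), block_size)
    ]


def log_medians(medians):
    """Log-transform block medians (base 10 assumed for illustration)."""
    return [math.log10(m) for m in medians]


session_minutes = 3 * 60 + 45           # three hours and forty-five minutes
minutes_per_block = session_minutes / 7  # about 32 minutes per block
```

The median is less sensitive than the mean to a single very long response within a block, which is one plausible reason for preferring it with only seven trials per block.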

Figure 1 pictures the results diagrammati- 
cally. It is apparent that in the moving con- 
ditions (which involved noise and vibration),
the response times were greater than in the 
stationary conditions. 

Table 1 summarizes the analysis of vari- 
ance. For this analysis the median values 
previously discussed were normalized by a 
logarithmic transformation. The variances between
blocks of trials, between environmental
conditions, and between Ss were all much 
greater than would be expected by chance. 
The interaction of Ss with conditions was sig- 
nificant. The interaction of conditions and 


[Figure 1: curves for the night moving, day moving, night stationary, and
day stationary conditions plotted against blocks of trials.]

Fig. 1. Median response times on successive blocks
of trials under different experimental conditions.


Table 1

Analysis of Variance of Log Median Response Times
on Successive Blocks of Trials

                           Mean
Source               df    Square    F
Blocks of trials      6    0.172     3.31**
Conditions            3    2.600     [illegible]
Subjects             11    0.551     2.64*
C X B                18    0.077     1.60
C X S                33    0.209     4.35*
S X B                66    0.052     1.08
C X B X S           198    0.048

* P < .05.
** P < .01.


blocks fell short of significance at the 0.05 
level. 
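
The legible F ratios in Table 1 can be reproduced from the mean squares if each main effect is tested against its interaction with the other factor and each two-way interaction against the three-way interaction. That pairing of effects with error terms is an inference from the printed numbers, not something the source states, and the names in the sketch are illustrative.

```python
# Mean squares as printed in Table 1.
mean_squares = {
    "Blocks of trials": 0.172,
    "Conditions": 2.600,
    "Subjects": 0.551,
    "C x B": 0.077,
    "C x S": 0.209,
    "S x B": 0.052,
    "C x B x S": 0.048,
}

# Assumed error term for each effect whose F is legible in the table.
error_term = {
    "Blocks of trials": "S x B",
    "Subjects": "C x S",
    "C x B": "C x B x S",
    "C x S": "C x B x S",
    "S x B": "C x B x S",
}

# F = effect mean square / error mean square, rounded as in the table.
f_ratios = {
    effect: round(mean_squares[effect] / mean_squares[err], 2)
    for effect, err in error_term.items()
}
```

These ratios match the printed values of 3.31, 2.64, 1.60, 4.35, and 1.08. The F for conditions is not legible in the source; if it was also tested against C x S, it would be about 2.600 / 0.209, or roughly 12.4, but that figure is a reconstruction and not a reported result.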

A further breakdown of the analysis revealed
that the significant variance between
environmental conditions was largely attributable
to the difference between the moving
and stationary conditions. The difference between
the means of the day and night conditions
was not significant. Variance between
blocks of trials was largely attributable to the
large increases in median response times on
the second and third blocks of the day moving
condition (see Figure 1). These increases
also explain the nearly significant interaction
of conditions and blocks of trials.


Discussion 


The large, very significant increases in response
times in the moving conditions presumably
reflected the influence of exposure to
noise or vibration or both. The data afford
no information as to the relative contribution
of these stimuli to the observed performance
decrement. These changes, as well as the
nonsignificant overall temporal changes, corroborate
Broadbent's findings (4).

It is especially interesting that when Ss
were exposed to heat alone, in the day stationary
condition, no significant decrement resulted,
but that when Ss were exposed to heat,
noise, and vibration in the day moving condition,
a significant though transitory additional
decrement was produced.

No final conclusion as to the nature of these
effects may be drawn. Conceivably the differential
effects of the noise and vibration
and the heat might reflect the presence of dif- 
ferent physiological or psychological mecha- 
nisms, but this is by no means certain. A 
more crucial exploration of the problem 
should be undertaken. 


Summary 


Twelve Ss in an Army troop carrier were
asked to detect and respond to obscure, randomly
occurring signals under each of four
field conditions. In the control condition, 
the noise and heat levels were moderate and 
the vehicle was stationary. During the noise 
and vibration condition, the vehicle was mov- 
ing, noise and vibration were considerable, 
and the temperature was moderate. In the 
heat condition, the heat was rather intense, 
the noise was moderate, and the vehicle was 
stationary. During the heat, noise, and vibration
condition, the vehicle was moving,
and the noise, vibration, and heat levels were
rather high.

Noise and vibration produced by the moving
vehicle appreciably increased the median
response times of the Ss. Further decrement
occurred when heat was combined with noise
and vibration, but the effect was relatively
transitory. Heat alone had no apparent effect.
Changes occurring as a function of
elapsed time were not apparent.


Received April 15, 1957. 




References 


1. Adams, J. A. Vigilance in the detection of low-intensity
   visual stimuli. J. exp. Psychol., 1956,
   52, 204-208.

2. Bakan, P. Discrimination decrement as a function
   of time in a prolonged vigil. J. exp. Psychol.,
   1955, 50, 387-390.

3. Berlyne, D. E. Attention, perception, and behavior
   theory. Psychol. Rev., 1951, 58, 137-146.

4. Broadbent, D. E. Noise, paced performance,
   and vigilance tasks. Brit. J. Psychol., 1953,
   44, 295-303.

5. Deese, J., & Ormond, E. Studies of detectability
   during continuous visual search. WADC
   Tech. Rep., 1953, Rep. No. 53-8.

6. Deese, J. Some problems in the theory of vigilance.
   Psychol. Rev., 1955, 62, 359-377.

7. Fraser, D. C. The relation between angle of displays
   and performance on a prolonged visual
   task. Quart. J. exp. Psychol., 1950, 2, 176-181.

8. Fraser, D. C., & Jackson, K. F. Effect of heat
   stress on serial reaction time in man. Nature,
   1955, 176, 976-977.

9. Kryter, K. D. The effects of noise on man. J.
   Speech Hearing Dis., 1950. Monograph supplement.

10. Mackworth, N. H. Researches on the measurement
    of human performance. London: His
    Majesty's Stationery Office, 1950. (Medical
    Res. Council Spec. Rep. Ser. No. 268.)

11. Pepler, R. D. The effects of climatic factors on
    the performance of skilled tasks by young
    European men living in the tropics. IV. A
    task of prolonged visual vigilance. Appl. Psychol.
    Unit Rep., 1953, Rep. No. RNP 53/751.


Journal of Applied Psychology 
Vol. 42, No. 1, 1958 


A Study of the Value of the Owens-Bennett Mechanical 
Comprehension Test (Form CC) as a Measure of the 
Qualities Contributing to Successful Performance 
as a Supervisor of Technical Operations in an 
Industrial Organization 


Robert L. Decker 1


West Virginia University 


The problem with which the present study 
is concerned is that of selecting and placing 
individuals in supervisory positions in the 
technical operations or departments of a com- 
pany. Such operations would include the de- 
sign and development of mechanical and proc- 
ess equipment, production supervision, and 
the study and improvement of production 
methods. An analysis of the duties and re- 
sponsibilities of men working in these areas 
suggests that some quality which might be 
termed mechanical aptitude should be a fac- 
tor in successful performance on the job. 
Specifically, the object of the present investi- 
gation was to determine whether or not the 
qualities measured by a widely used (1) test 
of mechanical aptitude, Mechanical Compre- 
hension Test—Form CC (10), are related to 
successful performance in these activities. 

Some degree of success has been attained in 
finding relationships between mechanical apti- 
tude test scores and job performance (3, 6, 
11) where the performance is largely motor. 
However, the existence of a special aptitude, 
as distinct from general aptitude or intelli- 
gence, for performing in more complex types 
of situations such as those faced by the engi- 
neer has not been satisfactorily demonstrated. 
In 1942 Bennett and Cruikshank (2), after a 
review of the findings of research workers 
concerning the validity of mechanical apti- 
tude tests in predicting engineering school 
success, concluded that “Some of them approach
some degree of usefulness, but in general
the tests used are not thoroughly dependable
touchstones for predicting engineering
school success.” A review of the studies since
then suggests that this evaluation may still
apply today.

¹ The author wishes to express his appreciation to
Richard S. Uhrbrock for his kind suggestions and
help in the formulation of the problem and treatment
of the data.

Owens reports that Form CC of Mechani- 
cal Comprehension Test was developed with 
the intent that it would be a test of me- 
chanical comprehension with items of suffi- 
cient difficulty to measure higher levels of 
mechanical aptitude, i.e., the degree of apti- 
tude needed for successful performance in en- 
gineering courses or in engineering positions 
after graduation (9). On the basis of the re- 
sults obtained when the test was adminis- 
tered to 725 incoming freshmen, Owens con- 
cludes that the Form CC scores were making 
a significant independent contribution in the
prediction of engineering school grades. Although
this conclusion may be supported by
the obtained data, it seems that the amount
of this contribution was so small that consideration
of scores on the test in addition to
general aptitude test scores in selecting students
or employees might not be justifiable
considering the added time and expense involved.

In another study designed to determine the
relationship between scores on Mechanical
Comprehension Test (Form CC) and grades
in engineering school (7), using 130 freshmen
students, a product-moment correlation of
about + .40 was obtained with first year
grades. No attempt was made to determine
the degree to which these scores were measures
of general aptitude, i.e., intelligence.

Assuming that, as Owens (9) suggests,
Form CC does measure a higher level of mechanical
comprehension or aptitude, it seems
logical to hypothesize that scores on this test
should be related to performance in a technical
or engineering position in industry.
The purpose of the present investigation is to
determine whether or not such a relationship
exists. The procedure will involve an item
analysis of the test in question using both
internal and external criteria.


Procedure 
The Test 


The test used was the Mechanical Comprehension 
Test—Form CC (10). This form contains 60 multiple-choice
items designed to sample the examinee’s
ability to understand mechanical relationships and
the results of forces acting upon objects. For example,
in some of the items the examinee is presented
with drawings showing different possible designs
of a mechanical device and asked to identify
the one which would give the greatest mechanical
advantage. The test was administered to the Ss of
the present study in groups of 25 or less and according
to the directions of the authors of the test
(10). The Ss indicated their answers on IBM answer
sheets designed specifically for this test. The
score used was the total number right.


The Subjects


The Ss of the investigation were 208 members of
supervision in a large manufacturing organization.
All were working in the departments of the company
concerned with manufacturing or applied research.
The divisions of the company represented
by the Ss were production supervision, industrial engineering,
and equipment design. All Ss were college
graduates employed by the company during the
N-year period prior to the study. The mean length
of service at the time of the study was approximately
four and one-half years. Eighty per cent of
the Ss held degrees in engineering or the physical sciences.
Twenty per cent of the men who were engaged
in the industrial engineering and production
activities had liberal arts degrees. All the Ss had
been hired according to similar standards in a highly
developed selection program. The selection procedure
included a test of general aptitude developed
by the company. Most of the men had been hired
immediately upon graduation from college. All had
participated in well planned, on-the-job training
programs to prepare them for their particular duties.


The Criterion 


The measure of performance used in the present
study was a rating of each S by his immediate superior.
The rating scale used was designed and [illegible]
by a consulting organization for the purpose
of evaluating the performance of supervisory personnel
(8). It consists of a series of 60 scale statements
describing a supervisor's performance. In response
to each statement the rater indicates whether
or not it applies to the ratee by checking either “Yes
or True” or “Not True at Present.” The statements
are so worded that in one-half the items “Yes or
True” is the favorable response while in the other
half “Not True at Present” is favorable. The state- 
ments deal with such areas of performance as pro- 
ductivity, dependability, accuracy of work super- 
vised, relationships with associates, etc. In the scor- 
ing of the scale the statements are weighted with 
values ranging from 1 to 3. These weights were 
assigned to the various statements by the authors of 
the scale according to the statements’ power to dis- 
criminate between good and poor supervisors in the 
original standardization group. The criterion meas- 
ure for each S in the present study was the total 
raw score, i.e., the total of the weights of the items
checked favorably, on the rating scale. Since this 
measure includes evaluations of performance in so 
many areas of the position it is undoubtedly fac- 
torially complex. This would tend to reduce its 
correlation with any other measure. However, in 
the opinion of the present author the technical or 
engineering competence of the individual should be 
a major factor in his overall success on the types of 
work being performed by the present Ss. 

As a further check on the acceptability of this 
scale the present author computed a split-half re- 
liability coefficient based upon the 208 rating scales 
used in this investigation. Scores on the first 30 
statements in the rating scale were correlated with 
those for the second half of the rating scale. Ap- 
plication of the Spearman-Brown formula resulted 
in a corrected reliability coefficient of + .8986 (5). 
This was considered acceptable. 
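
The split-half step described above can be sketched as follows. This is only an illustration of the Spearman-Brown correction; the function names are hypothetical, and the uncorrected half-test correlation is not reported in the text, so it is recovered here by inverting the formula from the corrected coefficient of .8986.

```python
def spearman_brown(r_part: float, factor: float = 2.0) -> float:
    """Step up a part-test correlation to the reliability of a test
    `factor` times as long (factor=2 is the split-half case)."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def implied_half_correlation(r_full: float) -> float:
    """Invert the split-half case: the correlation between halves
    implied by a corrected reliability coefficient."""
    return r_full / (2.0 - r_full)


# The text reports only the corrected coefficient (.8986); the value
# below is an inference from the formula, not a figure from the study.
r_half = implied_half_correlation(0.8986)
print(round(r_half, 3))
print(round(spearman_brown(r_half), 4))
```

On this reading, the two 30-statement halves would have correlated at a little above .8 before correction.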


Results 


The criterion measures (weighted scores on 
the rating scales) ranged from 16 to 107 with 
a mean score of 71.13 and an SD of 21.25. 
The raw scores on Mechanical Comprehension 
Test ranged from 19 to 60 with the mean at 
48.81 and an SD of 6.81. The computation 
of a product-moment correlation between the 
criterion measures and test scores yielded an 
r of .074. This coefficient is not significant 
at the 1% level of confidence (5). 
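
The significance statement above can be checked approximately with a short sketch. It assumes Fisher's z transformation with a normal approximation, which is a substitute for the tabled critical values in Garrett (5) that the author presumably consulted; the function name is hypothetical. The same check is applied to the r of .31 reported later for the eleven-item rescoring.

```python
import math
from statistics import NormalDist


def correlation_p_value(r: float, n: int) -> float:
    """Two-tailed p-value for H0: rho = 0, using Fisher's z
    transformation and a normal approximation (an assumption;
    the paper itself refers to published tables)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


p_total = correlation_p_value(0.074, 208)   # total test score vs. criterion
p_eleven = correlation_p_value(0.31, 208)   # eleven-item score vs. criterion
print(f"{p_total:.3f}", p_total < 0.01)
print(f"{p_eleven:.6f}", p_eleven < 0.01)
```

Under this approximation the total-score coefficient falls well short of the 1% level, while the eleven-item coefficient exceeds it, in agreement with the text.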

Item validity was measured by the biserial 
r between the response to each item and the 
criterion measures. The numerical values of 
the item validity coefficients ranged from .00 
to .37 with the median coefficient (without
regard to sign) being .08. Coefficients of cor- 
relation significant at the 1% level of confi- 
dence were obtained for fourteen of the 60 
items on the test. These items and their va- 
lidity coefficients are given in Table 1. A 
significant negative correlation was obtained 
for three of the items. 
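
The item validity measure named above, the biserial r between a pass/fail item and a continuous criterion, can be sketched as below. The textbook formula is assumed (difference of group means over the criterion SD, multiplied by pq over the normal ordinate at the point of division); the function name and the eight data points are hypothetical and are not taken from the study.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(passed: list[bool], criterion: list[float]) -> float:
    """Biserial correlation between a dichotomous item and a
    continuous criterion. Both pass and fail groups must be non-empty."""
    n = len(criterion)
    p = sum(passed) / n                      # proportion passing the item
    q = 1.0 - p
    mean_pass = fmean(c for c, ok in zip(criterion, passed) if ok)
    mean_fail = fmean(c for c, ok in zip(criterion, passed) if not ok)
    # Ordinate of the unit normal curve at the point dividing p from q.
    ordinate = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean_pass - mean_fail) / pstdev(criterion) * (p * q / ordinate)


# Hypothetical responses to one item and hypothetical rating-scale scores.
passed = [True, True, False, True, False, True, True, False]
criterion = [90.0, 75.0, 40.0, 82.0, 66.0, 71.0, 95.0, 58.0]
print(round(biserial_r(passed, criterion), 2))
```

A negative value, as found for three items in the study, would mean that Ss who missed the item tended to receive the higher ratings.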

The obtained biserial r’s between individual
items and total scores on the test were all
positive and ranged from .85 to .01 with the
median coefficient being + .50. Only two of
the items, numbers 7 and 38, failed to show a
significant positive correlation with the total
score.


Table 1

Item Validity Coefficients Significant at 1% Level

Item No.    Item Validity
    3           -.18
    5           [illegible]
    6            .25
    7           -.33
   10            .18
   15            .25
   28            .25
   29            .37
   41            .25
   47            .34
   49           -.25
   51            .22
   52            .28
   54            .20

The item difficulty measures ranged from 
1% to 84%, i.e., the easiest item was missed 
by 1% of the group while the most difficult 
was answered incorrectly by 84% of the Ss.
The median item difficulty was 13%.

The answer sheets were rescored on the basis
of the eleven items for which significant
positive correlations with the criterion were
obtained. A product-moment coefficient of
correlation computed between total number
right on the eleven items and the criterion
measures yielded a coefficient of .31. This is
significant at the 1% level of confidence.


Discussion 


The obtained r of .074 between performance
measures and scores on Mechanical Comprehension
Test suggests no basis for assuming
that a relationship exists between the qualities
measured by the total score on this test
and success as a supervisor in a technical industry.
The fact that the validity coefficients
obtained for the majority of the individual
items are not significant and that the 14 coefficients
which are significant are low further
supports the conclusion that the measures
are unrelated. When the answer sheets
of the Ss were rescored for total number right
for the eleven items with positive validity coefficients
significant at the 1% level and these
scores correlated with criterion measures, the
obtained r was + .31. This is significant well
beyond the 1% level. Since there are only 11 
valid items it does not seem justifiable to sug- 
gest that the test be used in its present form 
and scored on the basis of this small number 
of items alone, especially without cross vali- 
dation of these items with another group of 
Ss. The results obtained in the present study 
suggest that before the test could be recom- 
mended for use in industrial situations simi- 
lar to those of the present study the authors 
should conduct further research to determine 
what the characteristics of the valid items 
are which cause them to select satisfactorily. 
In particular, items 3, 7, and 49, which show 
significant negative correlations, should re- 
ceive attention. 

The measures of item difficulty suggest that 
the test is too easy for a group of Ss having 
the training and background of those used in 
the present study. It is true that the Ss of 
this study were college graduates who had 
been carefully selected before employment on 
the basis of general ability and personal char- 
acteristics. However, if mechanical aptitude 
exists independent of these factors the test 
should have shown some relationship to per- 
formance on the job. 

The fact that moderate to high correlations 
with total score were found for most of the 
items in Mechanical Comprehension Test sug- 
gests that the test is consistently measuring 
some quality. An inspection of the items suggests
the hypothesis that the quality may be
a type of nonverbal intelligence or judgment.
Studies such as those of Bruce (4) and Sartain
(12) would seem to support this. In
both of these cases significant positive correlations
were obtained with Form AA of
Mechanical Comprehension Test and tests
[...]ty. A reasonable further hypothesis
[...] be that, based on the fact the
[...] measure of general intelligence
at lower levels. This possibility might
account for the correlations which have been
obtained between scores on the test and progress
in training programs and on certain types
of jobs (3, 11).

A percentile table constructed for the Ss 
used in the present study did not appear to 
differ markedly from the one given by Owens 
and Bennett for college seniors in engineering 
courses (10). The mean raw score for the
engineering school seniors is 47.00 while the
mean for the present group of Ss is 48.81.
The 75th percentile for the college seniors is
at a raw score of 51 while that for the Ss
would be approximately 54. The 25th percentile
for seniors is at raw score 43 while
that for the Ss would be about 46. There is 
no indication that there is a substantial dif- 
ference between the college seniors who were 
members of the standardization group and 
the present group of Ss in terms of the qual- 
ity being measured by Mechanical Compre- 
hension Test. 


Summary and Conclusions 


Two hundred and eight members of supervision
employed in the technical operations
of a large manufacturing organization took
Mechanical Comprehension Test (Form CC)
and were rated on performance as a supervisor
by their superiors. The statistical
analyses of the data included measures of
overall test validity and an item analysis with
measures of item difficulty, item validity, and
internal consistency. Test records were rescored
on the basis of the 11 items found to
be significantly related to the criterion. These
scores were then correlated with the criterion
measures. A percentile table based upon the
population of which the present Ss are a sample
was constructed.

Based upon the data of the present investigation
the following conclusions seem warranted.

1. The qualities measured by Mechanical
Comprehension Test (Form CC) are not related
to successful performance as a supervisor
under the conditions of the present
study.

2. Form CC of Mechanical Comprehension 
Test is not sufficiently difficult for use with 
college graduates having the training and 
backgrounds of the Ss of the present study. 

3. The items of this test are consistently 
measuring some quality. It is hypothesized 
that this is an aspect of general intelligence 
or judgment rather than anything “mechani- 
cal.” 


Received April 29, 1957. 


References 


1. Baker, Gertrude, & Peatman, J. C. Tests used
in Veterans Administration advisement units.
Amer. Psychologist, 1947, 2, 99-102.
2. Bennett, G. K., & Cruikshank, R. M. A summary
of manual and mechanical ability tests.
New York: Psychological Corporation, 1942.
3. Bennett, G. K., & Fear, R. A. Mechanical comprehension
and dexterity. Personnel J., 1943,
22, 12-17.
4. Bruce, M. M. The prediction of effectiveness as
a factory foreman. Psychol. Monogr., 1953,
Vol. 67, No. 12 (Whole No. 362).
5. Garrett, H. E. Statistics in psychology and
education. New York: Longmans, Green,
1948.
6. Goodman, C. H. The MacQuarrie test for mechanical
ability: I. Selecting radio assembly
operators. J. appl. Psychol., 1946, 30, 586-595.
7. Halliday, R. W., Fletcher, F. M., & Cohen, Rita
M. Validity of the Owens-Bennett Mechanical
Comprehension Test. J. appl. Psychol.,
1951, 35, 321-324.
8. King, J. E., & Wingert, Judith W. Merit rating
series—performance-supervisor. Tucson, Ariz.:
Industrial Psychology, Inc., 1953.
9. Owens, W. A., Jr. A difficult new test of mechanical
comprehension. J. appl. Psychol.,
1950, 34, 77-81.
10. Owens, W. A., Jr., & Bennett, G. K. Manual,
Mechanical Comprehension Test. New York:
The Psychological Corporation, 1949.
11. Patterson, C. H. The prediction of attrition in
trade school courses. J. appl. Psychol., 1956,
40, 154-158.
12. Sartain, A. Q. Relation between scores on certain
standard tests and supervisory success in
an aircraft factory. J. appl. Psychol., 1946,
30, 328-332.




Journal of Applied Psychology 
Vol. 42, No. 1, 1958 


Reaction Time in the Cold 


Warren H. Teichner ¹


Quartermaster Research and Development Center 


As reported by Forlano (3) and by Teich- 
ner (7) studies of the effects of cold envi- 
ronments on the simple reaction time (RT) 
suggest that RT is not affected by low ambi- 
ent temperatures down to — 50° F. How- 
ever, temperature is only one of the factors 
which make up cold environments. The cool- 
ing power of air actually depends on both its 
temperature and speed of movement (wind- 
speed). The effect of each of these, singly 
and in combination, must be studied before 
safe generalizations can be made about the 
effect of the cold on RT. Further, the combined
action of temperature and wind (windchill ²)
in determining the cooling rate of
exposed bodies has been formulated quantita- 
tively; thus, there is a basis for a rational ap- 
proach to the combined effects problem. It 
was the concern of the present investigation, 
therefore, to study the effects of the cold on 
RT through variation of all three physical 
factors. 

As long as Ss wear protective clothing, as 
they have in previous studies, S-R relation- 
ships may be misleading. That is, with no 
information beyond the stimulus conditions 
and S’s response, it is not possible to deter- 
mine whether the environment was actually 
effective in cooling the body. Failure to find 
a temperature effect in previous studies may 
have been the result of lack of actual body 
cooling. Thus, studies which fail to measure
body cooling cannot yield information of general
value nor are the results amenable to
theoretical considerations, either physiological
or psychological. The present study was
designed, therefore, to obtain body surface
temperatures for relationship to the effects of
cold environments.

¹ Now at the University of Massachusetts.

² Windchill is a measure of that part of the total
cooling of a body due to the action of wind. The term
is not usually applied to temperatures above freezing.
Values of windchill used in this study were obtained
from reference (6) based on Siple and Passel’s (5)
formula:

$$K_0 = \left(\sqrt{wv \times 100} + 10.45 - wv\right)(33 - T_a)$$

where:

$K_0$ = Total cooling in kilogram calories per square
meter per hour,
$wv$ = Wind velocity in meters per second,
$T_a$ = Air temperature in degrees centigrade.
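
The Siple and Passel windchill formula given in the footnote can be evaluated directly, as in the sketch below. The function name and the unit conversions from ° F. and mph are additions for illustration. Because the study took its windchill values from published tables (6), the computed figures should be expected to agree with the tabled ones only to within a per cent or so.

```python
import math


def windchill_siple_passel(temp_f: float, wind_mph: float) -> float:
    """Total cooling K0 in Kg.Cal./m^2/hr from air temperature (deg F)
    and windspeed (mph), after conversion to the formula's units."""
    temp_c = (temp_f - 32.0) * 5.0 / 9.0   # degrees centigrade
    wind_ms = wind_mph * 0.44704           # meters per second
    return (math.sqrt(100.0 * wind_ms) + 10.45 - wind_ms) * (33.0 - temp_c)


# Conditions used in the study; compare with the tabled windchill values.
for temp_f, wind_mph in [(-15, 5), (-15, 20), (-35, 5), (-35, 15)]:
    print(temp_f, wind_mph, round(windchill_siple_passel(temp_f, wind_mph)))
```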


Method 


Six hundred and forty infantrymen from Fort 
Devens were used as Ss. Twenty-man groups were used,
one per day until the total number was exhausted.
On arrival at the laboratory the 20 Ss were randomly
sorted into five-man subgroups and each operation
of the investigation was phased to handle
the sequential appearance of the four subgroups. 
Two subgroups were studied before the noon meal 
and two after it. Twenty Ss were eliminated for 
medical reasons prior to starting. 

Ss were taken to a dressing room (55-60° F.) 
which interconnected with the climatic chamber, 
where they undressed, a multi-point thermocouple 
harness was put on them and they were dressed in 
appropriate clothing. These procedures were per- 
formed “by the numbers” so that all five Ss were 
dressed at the same time, thus avoiding individual 
overheating. While Ss were in the dressing room 
standard instructions were read to them which ex- 
plained the details of the procedure to follow. When 
dressed, they were taken into the climatic chamber 
which was pre-set for the appropriate environmental 
conditions. 

In the chamber, the five Ss sat side by side about
three ft. apart, before a long table in front of a
large observation window. They faced sideways to
the direction of air movement and were in front
view of technicians operating the equipment outside
of the chamber. From their positions, however,
Ss were unable to observe the operation of the
equipment.


On completion [...] Ss were seated and 25 successive
[...] 10 min. Following [...] ran in place slowly
for three min. (mild [...]

Each S was provided with a Morse key fastened
to the table. At a verbal ready signal, Ss closed the
keys with their preferred hands. At the reaction
signal, which was provided by 100 w. lights mounted
opposite them, they removed their hands from the
keys as quickly as possible and rested them on the
table. Standard Electric Timers provided a .01 sec.
recording of the times between the simultaneous closing
of each of the five simple circuits (onset of
lights) and the individual reopening of each circuit
as the Morse keys were released.

Ten thirty gauge copper-constantan thermocouples
were taped to different parts of the body of each S.
The output of each thermocouple was recorded by a
Leeds and Northrup recording potentiometer; they
were also automatically weighted according to the
percentage of total body surface area each represented,
integrated electronically and recorded as a
measure of mean weighted skin temperature. The
ten thermocouple placements and their respective
percentage weights are shown in Table 1. In view
of the lack of familiarity of the Ss with the situation,
it was not deemed advisable to obtain rectal
temperatures although these would have been highly
desirable.


Table 1

Placement of Thermocouples and Associated
Percentage Weights *

Position        Weight
Instep           .050
Calf             .150
Lat. thigh       .125
Med. thigh       .125
Back             .125
Chest            .125
Upper arm        .070
Lower arm        .070
Hand             .060
Cheek            .100

* Mean Weighted Skin Temperature = Σ (Position × Weight).

The output of the thermocouples was recorded in
sequence at a rate which provided a complete description
of each S’s skin temperature once every
four min. In addition, the output of an electronic
analog-to-digital computer ³ working off the armature
of the potentiometer was fed to a No. 523 IBM
Summary Punch. Thus, the skin temperatures were
immediately available for IBM processing. The potentiometer
recordings were used as a [...]
the skin temperatures of the Ss. Men
were removed from the experiment as soon as possible
after an extremity dropped to 38° F.

The experimental plan called for a 2 × [...] of
temperature and windspeed, a number of
temperatures at constant windspeed and two groups
of Ss at 60° F., one lightly clothed (fatigues) and one
group nude (shorts and socks). Other than these
two groups, all other Ss wore a complete standard
Arctic uniform. Difficulties in keeping Ss safely
above frostbite level at the higher windchills required
some modifications in experimental plans.
The actual conditions used and the numbers of Ss
who started and finished are shown in Table 2.

³ G. M. Giannini, Datex Digital Encoding System.


Results 


Data were used only from Ss who com- 
pleted the experiment. Each RT was trans- 
formed to its reciprocal and all results were 
treated in terms of the transformed measure. 
Since this was a reciprocal of time, it may be 
thought of as an index of speed of reaction 
and will be called reaction speed (RS). In- 
spection of successive reactions in each group 
did not suggest any trend within the reaction 
series, either one suggesting performance in- 
crements or decrements. For this reason the 
mean RS was obtained for each S for the 25 
responses in the first series. These values 
were used as the basis for determining all 
effects. 

A plot of RS vs. ambient temperature at 
constant windspeed of five mph showed very 
little variation among the mean values. An 
analysis of variance of the temperature effect 
based on these data yielded an F ratio of less 
than 1.00 which confirms the conclusion that 
these temperatures had no significant effect. 

An analysis of variance of the 2 X 4 fac- 
torial represented by temperatures of — 15° 
F. and — 35° F. at windspeeds of 5, 10, 15, 
and 20 mph is presented in Table 3. The 
unequal frequencies of this factorial were 
treated as described by Rao (4); the sum- 
mary table also follows Rao. Evaluation of 
the wind-temperature interaction mean square 
provided by Table 3 indicates that these two 
factors did not interact significantly in their 
effects. An F of 12.21 was obtained for the 
temperature effect and of 4.88 for the wind 
effect. Both of these are significant at less 
than the .01 level of risk. Thus, it may be 
concluded that both the temperatures and 
the winds involved in this analysis had significant,
independent effects on RS.

Table 2

Experimental Conditions

Temperature   Windspeed   Windchill             No. Subjects
(° F.)        (mph)       (Kg.Cal./m.²/hr.)     Start    Finish        Clothing
 60              5        [illegible]             59     [illegible]   Fatigues
 60              5        [illegible]            118     [illegible]   Nude
[illegible]      5           780                 100      100          Arctic
  0              5         1,166                  40       40          Arctic
-15              5         1,359                  39       39          Arctic
-15             10         1,609                  40       40          Arctic
-15             15         1,765                  38       37          Arctic
-15             20         1,873                  40       27          Arctic
-15             30         2,018                  18        3          Arctic
-25              5         1,488                  17     [illegible]   Arctic
-35              5         1,617                  20       19          Arctic
-35             10         1,914                  19       19          Arctic
-35             15         2,100                  37       25          Arctic
-35             20         2,288                  35       12          Arctic


Table 3

Analysis of Variance of Ambient Temperature and Windspeed Effects

                                 Sum of    Mean   |  Mean     Sum of
Source                      df   Squares   Square |  Square   Squares   df   Source
Wind ignoring temperature    3     8.22    2.74   |  5.58      5.58      1   Temperature ignoring wind
Interaction                  3     1.51     .50   |
Temperature                  1     6.72    6.72ᵃ  |  2.68ᵃ     8.03      3   Wind
Between cells                7    15.11
Within cells               210   116.08     .55
Total                      217   131.19

ᵃ p < .01.


[Figure: reciprocal RT plotted against windspeed (mph), with separate curves for -15° F. and -35° F.]
Fig. 1. Effects of windspeed on reaction speed.


When the results summarized in Table 3
are considered along with the finding of no
temperature effect at the 5 mph windspeed,
an interaction of wind and temperature is
suggested, but it is one which is not statistically
testable in the present experiment. To
examine this possibility further, Fig. 1 was
prepared. This figure shows the effects of
windspeed at -15° F. and -35° F.; it presents
the mean values for the present data
upon which the analysis of Table 3 was based,
and, in addition, it presents the result obtained
with the 30 mph wind at -15° F. Inspection
of this figure reveals that the differences
between the effects of the two temperatures
were large except at 5 mph where it is known
that the indicated difference is not reliable.
Both trends show decreasing RS with increasing
windspeed, the curve for -35° F. dropping
more rapidly. However, this trend appears
to flatten off or actually rise a [...]
after the initial large drop. This, as well as
the fact that the other trend appears somewhat
positively accelerated, does suggest an
interaction of temperature and wind. However,
as noted, no interaction is demonstrable.

Figure 2 presents pre- and postexercise
mean RS as a function of windchill and also
presents the mean values for the two 60° F.
groups. It can be seen that the RS was
slightly, but consistently greater following
exercise than before it. Both trends shown
in this figure exhibit a decrease in RS with
increasing windchill. The lowest windchill
result shown, 780 Kg.Cal./m.²/hr., may be
suspected of not representing an effect clearly
relatable to windchill. This group wore the 
arctic clothing in a relatively mild condition 
and there is some possibility, therefore, that 
the result obtained was due at least partly to 
a heat stress rather than anything that might 
be called cold. Support for this possibility 
may be found in Table 4 which presents the 
mean skin temperatures during the reaction 
series and which shows that this group was 
the warmest of all groups. For this reason, 
it does not seem safe to include the result ob- 
tained with this group in the general wind- 
chill trend. Figure 2 also shows that there 
was no essential difference in RS between the 
nude and clothed groups at 60° F. before 
exercise and only a very small difference be- 
tween them after exercise. 

An analysis of variance of the windchill
effect shown in Fig. 2, omitting the lowest
windchill group, was carried out on the pre-exercise
data. This analysis provided an F
of 5.96 which with 10/364 df is significant at
less than the .01 level. It may be inferred,
therefore, that the decrease in RS with increasing
windchill suggested by Fig. 2 represents
a nonrandom effect.


[Figure: reciprocal RT plotted against windchill (Kg.Cal./m.²/hr.), with separate curves for the pre-exercise and post-exercise series and separate points for the clothed and nude groups at 60° F.]
Fig. 2. Effects of windchill on reaction speed.


Table 4

Mean Weighted Skin Temperature Per Group During
Pre-Exercise Series and Correlation with RS

Windchill                     Mean Temperature
(Kg.Cal./m.²/hr.)     Nᵃ      (° F.)               r
   780                56       90.85              .035
 1,166                29       87.92              .086
 1,359                32       86.62             -.005
 1,488                14       85.78              .125
 1,609                20       85.50             -.199
 1,617                12       84.98             -.041
 1,765                31       85.74              .306
 1,873                18       84.37              .019
 1,914                16       83.55              .118
 2,100                20       82.76              .305
 2,228                10       81.22             -.494
60° F. Clothed        35       86.73              .218
60° F. Nude           66       80.24              .332ᵇ

ᵃ Number of subjects on whom complete skin temperature
data were recorded during reaction series.
ᵇ p < .01.

Further inspection of the pre-exercise re- 
sults in Fig. 2, omitting the lowest windchill 
value, suggests that the relationship between 
RS and windchill may be closely approxi- 
mated by a linear function. The least squares 
fit of such a function is given by Equation 1: 


$$RS = 5.59 - .000813\,W \tag{1}$$

where:

$RS$ = reciprocal RT in sec.
$W$ = windchill in Kg.Cal./m.²/hr.


The standard error of fit of Equation 1 is 
.26. The constant, 5.59, which limits the in- 
tercept is equivalent to an RT of .18 sec. 
which is in good accord with the magnitude 
of visual RT to be expected under ideal con- 
ditions. Thus, the equation, though approxi- 
mate, appears to have a reliability and va- 
lidity of value for practical approximation 
purposes. 
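
For practical approximation, Equation 1 can be applied as in the sketch below. The function names are hypothetical. The fit was made on the pre-exercise group means with the lowest windchill group omitted, so values outside roughly 1,100 to 2,300 Kg.Cal./m.²/hr. are extrapolations.

```python
def predicted_reaction_speed(windchill: float) -> float:
    """Equation 1: reciprocal RT (per sec.) as a linear function of
    windchill in Kg.Cal./m^2/hr."""
    return 5.59 - 0.000813 * windchill


def predicted_reaction_time(windchill: float) -> float:
    """Reaction time in sec., the reciprocal of the predicted RS."""
    return 1.0 / predicted_reaction_speed(windchill)


# The intercept corresponds to an RT of about .18 sec., as noted in the text.
print(round(predicted_reaction_time(0.0), 2))
for w in (1166, 1609, 2100):
    print(w, round(predicted_reaction_speed(w), 2), round(predicted_reaction_time(w), 3))
```

Over the fitted range the predicted RT lengthens by only a few hundredths of a second, which gives a sense of the size of the effect that the equation describes.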

Skin temperatures were available for all Ss
but due to the scanning procedure described,
mean values were available for only 359 Ss
during the first reaction series and a very 
small number of Ss during the second series. 
The mean weighted skin temperature of each 
group and the number of Ss on which the 
mean was based are shown in Table 4. It 
may be seen that the range of the group 
means was relatively small, and that skin tem- 
perature decreased, in general, with increased 
windchill. It can also be seen that the nude 
men as a group had the lowest skin tempera- 
tures of all. 

In order to study the possible relationship 
of RS to skin temperature, Pearson correla- 
tions were computed for each of the condi- 
tions of Table 4. The results are also shown 
in Table 4. All of the coefficients obtained 
were low and only one was significant in a 
probability sense. Assuming that all 13 co- 
efficients are estimates of a zero correlation 
we may ask of the probability of obtaining 
one significantly different from zero at the 
.01 level. This probability is .12 which is too 
high for rejection of the hypothesis. Thus, 
the correlation among the nude men cannot 
be accepted with confidence. In addition to 
the correlations shown in Table 4, a correla- 
tion was computed based on all 359 Ss. A 
coefficient of .18 was obtained which is sig-
nificant at the .01 level. However, the sig-
nificance of this correlation is presumably re-
lated to the significance of the correlation
obtained with the nude men and, therefore,
it cannot be accepted with any confidence.
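
The probability of .12 quoted for obtaining one significant coefficient among 13 can be checked with a short calculation. The sketch below assumes that the 13 coefficients are independent tests of a true zero correlation; that independence is an assumption of the sketch, and the function name is hypothetical.

```python
# A minimal sketch: chance of at least one "significant" result among
# several independent tests when every true correlation is zero.
# Independence of the 13 tests is an assumption of this sketch.


def prob_at_least_one(n_tests, alpha):
    """Probability that one or more of n_tests reaches the alpha level by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print(round(prob_at_least_one(13, 0.01), 2))
```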


Discussion 


The results are biased in the sense that an
increasing percentage of individuals suscep-
tible to frostbite were removed from the ex-
periment as the conditions became more
severe. Nevertheless, a clear and systematic 
impairment of performance was demonstrated, 
an impairment that could not have been less 
and would probably have been greater had 
these Ss not been removed. A further qualifi- 
cation must be made, that all conclusions 
apply to “unacclimatized” men, within about 
75 min. of exposure, not suffering physiologi-
cal distress. With these qualifications the re-
sults indicate that RT is not affected by low
ambient temperature, at least down to -35°
F., providing the windspeed does not exceed
about five mph. On the basis of previous 
conclusions (3, 7), the lower limiting tem- 
perature at low windspeed may be inferred 
to be less than this, at least -50° F. On
the other hand, for windspeeds of 10 mph and
greater, RS decreases with decreased tempera-
ture at least from -15° F. and below. It
was also shown that windspeed has a marked
effect on RS at least at temperatures of -15°
F. and below. Finally, it was demonstrated
that RS decreases systematically with in- 
creases in windchill. 

Equation 1 provides a first working for- 
mula for application to the design of equip- 
ment and clothing and to the use of men for 
cold-weather conditions. Although it is lim- 
ited to unacclimatized, selected men, and un- 
doubtedly subject to variation with changes 
in clothing, shelter, and the physiological con- 
ditions of individuals, the results suggest that 
the RS function is not importantly based 
upon physiological changes of the individual 
with cold exposure. At least, the lack of cor- 
relation of skin temperatures and, by infer- 
ence, rectal temperatures (1), with observed 
RS differences suggests that the function ob- 
tained was due to other than body heat 
losses. 

One plausible explanation of the results 
may be called the distraction hypothesis.
This hypothesis assumes that other aspects
of the environment (wind-produced noise,
discomfort, and the perceived threat of cold
exposure) provide competing stimuli which
interfere with the response elicited by the re-
action signal and thus produce increased la-
tencies. The presence of such competing
stimuli should be most critical during the
foreperiod of reaction, and, therefore, relat-
able in a measurable way to the presence
of nonoptimum preparatory muscular phe-
nomena (2, 8).

A distraction hypothesis has interesting im-
plications. The elicitation strength of dis-
tracting environmental stimuli should depend
on their intensity, frequency and duration of
previous occurrences, conditions of reinforce-
ment during these occurrences and the anx-
iety level of the individual; in short, on con-
ditions of learning. This hypothesis also sug-
gests that so-called acclimatized individuals,
short of marked physiological changes, may 
be individuals who are habituated in a psy- 
chological sense rather than acclimated in a 
physiological sense. Thus, it may be possible 
to speak not only of a physiological cold tol- 
erance, a term which refers to the resistance 
of the individual to the cooling power of the 
environment, but also to a psychological cold 
tolerance and mean by this resistance of the 
individual to the distracting power of the en- 
vironment. The former presumably depends 
upon physiological (circulatory, thalmic, etc.) 
and morphological (body fat, surface area 
and configuration) conditions and character- 
istics of the individual. The latter presum- 
ably depends upon the state of habituation 
of the individual and his anxiety level. 


Summary 


Visual RT’s were elicited from 620 soldiers
sorted into 14 different groups representing a
variety of ambient temperatures, windspeeds
and windchills. Included were two groups at
60° F., five mph, one of which was nude and
the other lightly clothed. RT was measured
after 45 min. of exposure and again following
a short, mild exercise, after 65 min. of ex-
posure. In addition, mean area-weighted skin
temperatures were obtained. The following
conclusions drawn from the results apply to
the effects of the cold on “non-acclimatized” 
and/or “non-habituated” men, not in physio- 
logical distress: 

1. At low windspeed, at least up to five
mph, low ambient temperature has no effect
on RS, at least down to -35° F. and prob-
ably down to -50° F.


2. At windspeeds of 10 mph and greater, 
low ambient temperature produces a signifi- 
cant decrease in RS. 

3. Windspeed produces a significant de- 
crease in RS. 

4. Mild exercise produces a small recovery 
in RS. 

5. If men of low “physiological cold toler- 
ance” are removed from the more severe en- 
vironmental conditions and if Ss wear protec- 
tive clothing, RS is essentially a linear de- 
creasing function of windchill. 

6. It was hypothesized that the RS func- 
tion obtained is psychological in nature; a 
specific hypothesis of “psychological cold tol- 
erance” was proposed. 


Received May 2, 1957. 


References 


1. Burton, A. C., Snyder, R. A., & Leach, W. G.
Damp cold vs. dry cold. Specific effects of
humidity and heat exchange of unclothed man.
J. appl. Physiol., 1955, 8, 269-278.

2. Davis, R. C. Set and muscular tension. Indiana
Univ. Publ. Sci. Ser. No. 10, 1940.

3. Forlano, G., Barmack, J. E., & Coakley, J. D.
The effect of ambient and body temperatures
upon reaction time. ONR, SDC, Rpt. 151-1-
13, 1948.

4. Rao, C. R. Advanced statistical methods in bio-
metric research. New York: John Wiley,
1952.

5. Siple, P. A., & Passel, F. Dry atmospheric cool-
ing in subfreezing temperatures. Proc. Amer.
phil. Soc., 1945, 89, 177-199.

6. Table of wind chill values. QM Res. and Dev.
Center, Climatic Research Lab., 1943.

7. Teichner, W. H. Recent studies of simple reac-
tion time. Psychol. Bull., 1954, 51, 128-149.

8. Teichner, W. H. Effects of foreperiod, induced
muscular tension and stimulus regularity on
simple reaction time. J. exp. Psychol., 1957,
53, 277-284.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


Weighted Application Blank Analysis of “Contingency” 
Items 


Thomas A. Mahoney 


Industrial Relations Center, University of Minnesota 


A method of analysis which has become in- 
creasingly common in prediction studies is 
the “weighted application blank analysis.” 
This method of analysis is a fairly simple and 
standardized method for development of pre- 
dictors from personal history or application 
blank information, and for development of 
weights for test scores in prediction. Despite 
the frequent use of this method in prediction 
studies, however, no attention has been given 
to the analysis of “contingency” items or in- 
formation, items where the answer is con- 
tingent upon answers to one or more previous 
questions. This note concerns the analysis 
and weighting of contingency items in the 
weighted application blank analysis. 

The weighted application blank analysis 
begins with identification of criterion groups: 
Group A composed of individuals classed ac- 
ceptable by the criterion, and Group B com- 
posed of individuals classed unacceptable. 
Possible responses to each item of the appli- 
cation blank are categorized for the tabula- 
tion of actual responses. Responses of Group 
A and Group B are tabulated separately for 
the entire list of questions or items. Provi- 
sion is frequently made for individuals who 
fail to respond to a particular item with the 
inclusion of a “no response” category. In 
this manner, a response of some sort can be 
tabulated for each individual in the two cri- 
terion groups. The next step involves calcu- 
lation of the percentage of each group which 
responded in each answer category for each 
of the questions. The same base, total indi- 
viduals in the group, is used in the calcula- 
tion of percentages responding in each answer 
category. The percentage of Group B indi- 
viduals is then subtracted from the percent- 
age of Group A individuals responding in each 
answer category. These percentage differ- 
ences are then transformed into weights for 
a scoring device (1, p. 225, tables). 
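
The tabulation steps described above can be sketched in a few lines. The example below uses the marital status counts that appear in Table 1 and stops at the percentage differences; the conversion of those differences into scoring weights depends on the published tables cited (1, p. 225), which are not reproduced here. The function and variable names are hypothetical.

```python
# A minimal sketch of the tabulation step of a weighted application blank
# analysis. Names are illustrative assumptions; counts are those of Table 1.
# Conversion of differences to weights is left out because it relies on
# the published tables cited in the text.


def percentages(counts):
    """Percentage of the whole group falling in each answer category."""
    base = sum(counts.values())          # same base for every category
    return {category: round(100 * n / base) for category, n in counts.items()}


def percent_differences(group_a, group_b):
    """Group A percentage minus Group B percentage for each category."""
    pct_a, pct_b = percentages(group_a), percentages(group_b)
    return {category: pct_a[category] - pct_b[category] for category in group_a}


marital_a = {"single": 21, "married": 63, "other": 16, "no response": 0}
marital_b = {"single": 47, "married": 47, "other": 4, "no response": 2}

if __name__ == "__main__":
    print(percent_differences(marital_a, marital_b))
```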




Difficulty may arise in the fact that an- 
swers to one or more questions in the appli- 
cation blank are contingent upon answers to 
a previous question. For example, responses 
to a question asking for “number of children”
will be contingent in part upon answers to a
previous question concerning marital status.
Analysis of these questions as separate and
independent questions can result in the as-
signment of unwarranted weights to certain
responses of the contingent questions. A spe- 
cific example of this problem is considered 
below: 


Assume that one question or item concerns
marital status. Possible responses are cate-
gorized as “single,” “married,” “other,” and
“no response.” A second question contingent
upon the previous one concerns number of
children. Response categories are: “none,”
“1-2,” “3-4,” “5 or more,” and “no re-
sponse.” Responses to these two questions
might be distributed as indicated in Table 1
with resulting weights calculated as also in-
dicated in Table 1. Note that those who
responded “single” on the item concerning
marital status are included in the “none”
response for number of children along with
those married individuals who have no chil-
dren. Since the response “single” is assigned
a negative weight, the response “none” in
number of children is also assigned a negative
weight due to the influence of those individu-
als who are single. An entirely different
weighting of responses to number of children
might have resulted if the single individuals,
those who “couldn’t respond” to number of
children, were not considered in the assign-
ment of weights for the number of children
question. At the same time, the response
“single” receives a negative weight twice in
the example in Table 1, once as a response
to the marital status item, and once as a re-
sponse to the number of children item.




Table 1

Assignment of Item Weights

                      Number            Percentage
                 Group A  Group B   Group A  Group B    A-B    Net Weight
Marital Status
  Single            21       47       21%      47%     -26%    [illegible]
  Married           63       47       63%      47%      16%        4
  Other             16        4       16%       4%      12%        4
  No response        0        2        0%       2%      -2%       -2
                   100      100      100%     100%
Number Children
  None              30       52       30%      52%     -22%       -5
  1-2               32       24       32%      24%       8%        2
  3-4               24       14       24%      14%      10%        2
  5 or more         14       10       14%      10%       4%        1
  No response        0        0        0%       0%       0%        0
                   100      100      100%     100%


This difficulty can be eliminated through
development of separate scoring systems for
the contingent questions, questions to which
answers may be contingent upon answers to
previous questions. In the example presented
here, an additional response, “can’t respond,”
might be assigned the number of children
question. This response would include the
single persons who couldn’t legitimately indi-
cate any children. These individuals would
not be considered then in the development of
scoring weights for the number of children
question. Table 2 indicates the method for
development of a separate scoring system for
answers to this contingent question. The
“can’t respond” responses are not included in
the calculation of percentages or in the as-
signment of weights to various responses. A
zero weight is assigned to the “can’t respond”
group having a neutral effect; these persons
are not penalized or favored again for their
single status. As indicated in Table 2, sepa-
ration of the “can’t respond” group from the
“none” response changes the weights assigned
the “none” response. Those married persons
with no children should not be assigned a
negative weight for their lack of children as
would have happened in the example of
Table 1.


Table 2

Revised “Contingency Item” Weights

                      Number            Percentage
                 Group A  Group B   Group A  Group B   %A-%B   Net Weight
Number Children
  Can’t respond     21       47                                    0
  None               9        5       11%       9%       2%        0
  1-2               32       24       41%      45%      -4%       -1
  3-4               24       14       30%      26%       4%        1
  5 or more         14       10       18%      19%      -1%        0
  No response        0        0        0%       0%       0%        0
                   100      100      100%     100%
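
The revised percentages of Table 2 can be reproduced, at least approximately, by removing the “can’t respond” individuals from the “none” category and from the percentage base before the group percentages are computed. The following sketch uses the counts of Table 1; the function names are hypothetical, and the final conversion of differences into weights is again left to the published tables.

```python
# A minimal sketch of the separate scoring of a contingent item.
# Single persons ("can't respond") are taken out of the "none" category
# and out of the base before percentages are computed.
# Names are illustrative assumptions; counts are those of Table 1.


def remove_cannot_respond(counts, cannot_respond, absorbed_in="none"):
    """Counts for the contingent item after setting aside those who can't respond."""
    revised = dict(counts)
    revised[absorbed_in] -= cannot_respond
    return revised


def percentages(counts):
    """Percentage of the remaining respondents in each answer category."""
    base = sum(counts.values())
    return {category: round(100 * n / base) for category, n in counts.items()}


children_a = {"none": 30, "1-2": 32, "3-4": 24, "5 or more": 14, "no response": 0}
children_b = {"none": 52, "1-2": 24, "3-4": 14, "5 or more": 10, "no response": 0}

# 21 single persons in Group A and 47 in Group B (marital status item).
pct_a = percentages(remove_cannot_respond(children_a, 21))
pct_b = percentages(remove_cannot_respond(children_b, 47))
differences = {category: pct_a[category] - pct_b[category] for category in pct_a}

if __name__ == "__main__":
    print(pct_a)
    print(pct_b)
    print(differences)
```

The “can’t respond” group itself receives a zero weight and plays no part in these percentages, so single persons are scored once, on the marital status item only.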


The point raised in this note may or may 
not be of much practical importance depend- 
ing on the particular items included in the 
application blank study. A study with few 
or no contingent items would not benefit 
much from the refinement of the method of 
analysis. Refinement of the method to ac- 
count for contingent items in one particular 
study, however, did improve the predictive 
ability of the scoring system. This study is 
one of several studies of personal history pre- 
dictors of management potential conducted 
within the Management Development Labo- 
ratory of the University of Minnesota Indus- 
trial Relations Center. The particular con- 
tingent items covered in this study are: 


Education
    High school organizations
    High school officerships
    High school letters
    Hours worked in high school
    College organizations
    College officerships
    College letters
    Hours worked in college

Marital status
    Number of children
    Wife’s education
    Wife’s occupation


The scoring system and weights developed
without reference to the fact that responses
to certain items were contingent upon re-
sponses to other items resulted in distribu- 
tions of high and low criterion groups and a 
cutting score which was exceeded by 84% of 
the high criterion group and by 36% of the 
low criterion group. Revision of the weights 
and scoring system to account for the con- 
tingent responses resulted in 87% of the high 
criterion group exceeding the cutting score 
and 35% of the low criterion group exceeding 
the cutting score. The revision to account 
for contingent responses did improve slightly 
the predictive ability of the scoring system. 
The extent of improvement cannot be as- 
sessed accurately without knowledge of the 
number of candidates accepted over a given 
time period. 

The point raised in this note is that an im- 
proved and more efficient method for de- 
veloping a weighted application blank is pro- 
vided through special handling of contingent 
responses. No additional effort is called for 
in this refinement, and the possible improve- 
ment of prediction suggests the value of this 
scoring of contingent responses. 


Received May 3, 1957. 


Reference 


1. Stead, W. H., & Shartle, C. L. Occupational
Counseling Techniques. New York: Ameri-
can Book Co., 1940.


Journal of Applied Psychology
Vol. 42, No. 1, 1958


Sensitization Versus Adaptation in Preparation for Emer- 
gencies: Prior Experience with an Emergency Ration 
and its Acceptability in a Simulated Survival 
Situation * 


E. Paul Torrance 


Survival Methods Branch, Air Force Personnel and Training Research Center 


Military training programs, child training
[practices], and educational programs involving
[realistic] stresses often have been attacked as
being more damaging than beneficial. De-
fenders of such programs argue that [exposure
to] fear-evoking stimuli in simulated
situations results in greater adaptation by
[removing] “the fear of the unknown.” The at-
tackers retort that facing these fear-evoking
stimuli replaces a “fear of the unknown”
with a “fear of the known” and results in
sensitization or a reinforcement of the fear-
evoking stimuli.

The author and his colleagues are conduct-
ing a series of field studies concerning this
issue in the realistically simulated survival
situation of the USAF Survival Training
School. One controversial issue in this train-
ing program has concerned the use of a meat
bar, known as “pemmican,” as the
survival ration in a seven-day simulated
survival, evasion, and escape exercise. Though
this ration is highly favored by polar ex-
plorers, hunters and trappers, and others (8),
its acceptability has been rather poor in
ration trials conducted by the United States
and Canadian Armies (2). In tests con-
ducted by the Aero Medical Laboratory (6),
about [illegible]% of the subjects (aircrew person-
nel undergoing survival training) reported
the ration “made them sick.”


1 This report is based on work done under ARDC
Project No. 7723, Task No. 77461, in support of the
research and development program of the Air Force
Personnel and Training Research Center, Lackland
Air Force Base, Texas. Permission is granted for re-
production, translation, publication, use, and dis-
posal in whole and in part by or for the United
States Government. The opinions or conclusions ex-
pressed or implied herein are those of the author.
They are not to be construed as necessarily reflecting
the views or endorsement of the Department of the
Air Force or of the Air Research and Development
Command.


In the light of the negative effects revealed 
by the Army and Air Force studies, its actual 
use in a training situation has at times been 
questioned even by those who believe that it 
is the best emergency ration now available. 
They fear that use of the ration in training 
might actually deter individuals from eating 
it in an actual emergency. They maintain 
that an individual in an actual emergency is 
more likely to eat the ration if he has never 
tried it than if he has tried it and disliked it. 
Their argument is strengthened by the high 
probability that an individual trying the ra- 
tion will dislike it. A study by Mason * of 
the psychological and training factors affect- 
ing acceptability of this ration, however, sug- 
gested that prior experience with the ration 
is related to higher acceptability. His study 
did not differentiate those who tried the ra- 
tion and liked it from those who had tried it 
and disliked it. Thus, the present study was 
designed to provide more definite informa-
tion concerning prior experience and reac-
tions to the ration.


Procedures 


Subjects 

The Ss of the study were 416 aircrewmen under-
going survival training and may be regarded as nor-
mal American males, ranging in age from 20 to 40.
Each S was issued eight meat bars (pemmican) at
the beginning of a seven-day simulated survival,
evasion, and escape exercise. This was supplemented
by about two pounds of beef and a small quantity
of vegetables, small packets of chili and onion
powder, 16 cubes of sugar, and eight packets each
of soluble coffee and tea. Since the training took
place during the summer in the Plumas National
Forest, supplementary plant and animal foods were
available.

2 Mason, R. A survey of the psychological and
training factors related to survival ration accept-
ability. Unpublished manuscript.


Collection of Data 


Following the seven-day exercise, Ss were adminis- 
tered a questionnaire to obtain measures of accept- 
ability and other information concerning the field 
experience. Acceptability items included:

1. The traditional hedonic scale (7-point in this 
study), requiring the S to indicate his reaction 
(ranging from like extremely to dislike extremely) 
to each of the following five common methods of 
preparing the ration: cold, heated with water only, 
heated with water and chili powder, heated with 
water and onion powder, and cooked in a stew with 
plant and/or animal foods. 

2. The number of meat bars eaten, 

3. Reasons for not eating all of the bars issued 
(if applicable). These included: part lost by acci- 
dent, made me sick, made me thirsty, tasted bad, 
smelled bad, too hard or dry, and too greasy.

4. Conditions under which the S would eat the 
ration in the future (whenever hungry, only when 
very hungry, and not even if very hungry). 


Analysis of Data 


Ratings on the hedonic scale were weighted from 
“1” (like extremely) to “7” (dislike extremely) 
and summed for the five methods of preparation. 
If an individual indicated that he had not tried the 
ration prepared according to one of the methods, 
this method was assigned the mean rating of the 
methods which had been tried. The number of bars 
eaten was used instead of the number of bars un-
eaten, as used in previous studies (11, 12), since this
index makes fuller use of the data. Some Ss bar-
tered bars from fellow crewmen and members of
other crews. Reported consumption ranged from
one-sixteenth of a bar to 25 bars. The only reason
for not eating studied was “made me sick,” [...]
as does actual consumption. Expressed willingness
to “eat it whenever hungry” was studied as the
critical response in the area of probability of
future use.
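
The scoring of the hedonic scale described above can be sketched briefly. The example assumes that an untried method of preparation is recorded as a missing value; that representation and the function name are assumptions of the sketch, not part of the procedure as reported.

```python
# A minimal sketch of the hedonic rejection score: ratings run from
# 1 (like extremely) to 7 (dislike extremely) and are summed over the
# five methods of preparation. A method the S did not try receives the
# mean of the methods he did try. Using None for "not tried" is an
# assumption of this sketch.


def hedonic_total(ratings):
    """Summed rejection score over all methods, filling untried ones with the mean."""
    tried = [r for r in ratings if r is not None]
    if not tried:
        raise ValueError("no method of preparation was tried")
    fill = sum(tried) / len(tried)
    return sum(r if r is not None else fill for r in ratings)


if __name__ == "__main__":
    # Hypothetical S who tried four of the five methods.
    print(hedonic_total([2, 4, None, 6, 4]))
```

With five methods the total can range from 5 to 35, higher totals indicating stronger rejection of the ration.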

The inexperienced group was first compared with
the experienced group (those who had tried the
ration and liked it and those who had tried it and
disliked it) on each of the four criteria. The means
of the first two criteria were compared by means of
critical ratios and the number “made sick” and the
number “willing to eat the ration whenever hungry”
were compared by chi squares computed according
to the method described by McNemar (5, pp. 224-
226). To study further the possible effects of the
three conditions of prior experience, data on each of
the four criteria were summarized for each type of
experience. Data concerning self-evaluated changes
in attitude were also summarized for each of the
three types of prior experience.


Results


The mean rejection score on the hedonic
scale, mean number of bars eaten, percentage
“made sick,” and percentage who would eat
the ration in the future for the “no experi-
ence” and “definite experience” groups are
presented in Table 1 along with appropriate
tests of significance. It will be noted that
those with prior experience (regardless of
whether they liked the ration or not) ex-
pressed greater liking, ate more bars, less fre-
quently reported having been made sick, and
more frequently expressed intentions of eat-
ing the ration in the future whenever hungry
than the inexperienced group. All differences
are significant at better than the 5% level.
Conclusions concerning the over-all effects of
prior experience are strengthened by the fact
that the present sample is heavily loaded with
Ss who had previously reacted unfavorably
(74 versus 33).


Table 1

Comparison of “No Prior Experience” Group versus “Definite Prior Experience” Group on
Four Measures of Acceptability of Meat Bar

                                               No Prior
                                              Experience    Experience    Significance of
Measure                                        (N = 287)     (N = 107)      Difference
Mean rejection score on hedonic scale            21.58         19.9        (CR = 2.11)
Mean number bars eaten                            6.22      7.[illegible]  [illegible]
Percentage “made sick”                           24.39         11.21       [illegible]
Percentage “eat in future whenever hungry”       34.49         52.34       .02 (χ² = 6.60)


Means and standard deviations of scores on
the hedonic scale and number of bars eaten
are shown in Table 2. Requirements for
homogeneity of variance are satisfied in the
case of scores on the hedonic scale but not in
the case of number of bars eaten, according
to Bartlett’s Test. It is interesting to note
that those with less prior experience tend to
be more variable in their verbalized attitudes
(hedonic scale ratings), whereas those with
most previous experience tend to be more
variable on the actual consumption criterion.
This variability is particularly marked in the
case of those who have used the ration and
disliked it.


Table 2

Means and Standard Deviations of Scores on Hedonic Scale and Number of Bars of Meat
Eaten for Each of Four Conditions of Prior Experience

                                     Mean                Mean Bars
Condition                 Number    Hedonic     SD a       Eaten      SD b
No previous experience      287      21.58      7.10        6.22      2.97
Had just tasted              22      21.20      7.12        7.23      3.09
Used and liked               33      14.91      5.03        8.54      3.46
Used and disliked            74      22.79      5.45        6.41      4.48

a Requirements for homogeneity of variance satisfied.
b Requirements for homogeneity of variance not satisfied. Using Bartlett’s Test, chi square = 9.24, p < .02.
An analysis of variance was made for scores
on the hedonic scale, since requirements for
homogeneity of variance were met. The re-
sults, shown in Table 3, indicate significant
variance due to the conditions of previous
experience. Direct tests were then made by
use of the critical ratio. As might be antici-
pated, those who had used the ration and
liked it expressed more favorable attitudes
than the other groups (significant at better
than the 1% level). The important finding,
however, is that those who had used the ra-
tion and disliked it did not express signifi-
cantly different attitudes from those who had
had no experience (CR = 1.55, not sig-
nificant). Though requirements for homogeneity of
variance are not met in the case of number
of meat bars eaten and an analysis of vari-
ance cannot be run justifiably, it is at least
interesting to note that the mean number of
bars eaten by those who had used the ration
and disliked it is slightly greater (though cer-
tainly not significantly so) than the number
eaten by those with no previous experience.


Table 3

Analysis of Variance Table for Scores on Hedonic Scale
for Four Conditions of Previous Experience

Source of Variation    Sum of Squares     df     Mean Square     F
Between                    1,499.29         3       499.76      11.02 (P < .001)
Within                    18,506.21       408        45.36
Total                     20,005.50       411


Table 4 presents the percentages reporting
having been “made sick” and the percent-
ages who say that they will eat the ration
whenever they are hungry for the conditions
of experience. For the purposes of this study,
the most important fact revealed by Table 4
is that proportionately fewer of those who
have used the ration and disliked it report
having been “made sick” than those with no
previous experience (chi square 5.15, p <
.05). Also, it is important to note that pro-
portionately as many of those who had previ-
ously disliked the ration state that they will
eat it whenever hungry as of those who re-
ported no previous experience in using the
ration.


Table 4

Percentages “Made Sick” and “Willing to Eat Ration
Whenever Hungry” for Each of Four Condi-
tions of Experience with the Ration

                                         Percentages
                                       Made     Eat Whenever
Condition                 Number       Sick        Hungry
No previous experience      287        24.39        34.49
Had just tasted              22        13.66        50.00
Used and liked               33         9.09        87.88
Used and disliked            74        12.16        36.49


Data concerning reported changes in reac-
tion for the three experienced groups are
shown in Table 5. It is interesting to note
that about 32% of those who had previously
eaten and disliked the meat bar liked it better
“this time.”


Table 5

Change in Reaction to Meat Bar Reported by Ss with Three Types of Prior Experience

                            Liked About Same        Liked Better          Liked Worse
Type of Experience         Number  Percentage    Number  Percentage    Number  Percentage
Had just tasted               7      31.82          11      50.00          4      18.18
Had eaten and liked          13      40.62          13      40.62          6      18.75
Had eaten and disliked       39      54.17          23      31.94         10      13.89
Total                        59      46.82          47      37.30         20      15.88

Note. One in the second category and two in the third category above did not respond to this item.
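
The chi square of 5.15 reported for the “made sick” comparison between those who had used the ration and disliked it and those with no previous experience can be approximately reproduced from Table 4. The sketch below assumes that the percentages correspond to 70 of 287 and 9 of 74 Ss, counts reconstructed here from the table rather than reported directly, and it uses the ordinary uncorrected formula for a fourfold table, which may or may not be identical with the procedure cited from McNemar. The function name is hypothetical.

```python
# A minimal sketch of a chi square for a 2 x 2 (fourfold) table,
# without correction for continuity. The cell counts are reconstructed
# from the percentages in Table 4 and are an assumption of this sketch.


def chi_square_2x2(a, b, c, d):
    """Chi square for the table [[a, b], [c, d]] with one degree of freedom."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# No previous experience: 70 "made sick," 217 not (24.39% of 287).
# Used and disliked: 9 "made sick," 65 not (12.16% of 74).
chi_square = chi_square_2x2(70, 217, 9, 65)

if __name__ == "__main__":
    print(round(chi_square, 2))
```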

Since other studies (11, 12) have shown 
that a number of psychological, social, and 
training factors are related to acceptability 
of this ration, the experienced group and in- 
experienced group were compared on the fol- 
lowing variables: success in obtaining supple- 
mentary food, effort to obtain supplementary 
food, perceived attitude of instructor, per- 
ceived effort of instructor to influence ac- 
ceptability, and perceived reaction of own 
crew. In every case the distribution of re-
sponses is so nearly identical for the two
groups that calculation of tests of significance
of the difference is unnecessary.

Although results have been shown in
Tables 2, 4 and 5 for those who had “just
tasted the ration,” this group has been elimi-
nated from the analyses reported herein be-
cause of the small number and the uncer-
tainty concerning the nature of this experi-
ence. The experience of “just tasting the
ration” in no case negatively affects reactions
and is accompanied by a consistent, though
not always significant, positive reaction.


Discussion 


The results of this study may be inter-
preted as supporting realistic training as
preparation for successful adaptation in emer-
gencies, at least in the area of food indoc-
trination. Those who had previously used
pemmican, whether they had liked it or not,
in comparison with those who had had no
previous experience with it expressed more
favorable reactions on the hedonic scale, ate
a larger number of bars, less frequently re-
ported having been “made sick,” and ex-
pressed a more favorable attitude toward its
future use. Even those who had tried the
ration and disliked it reacted as favorably as
those who had had no experience whatsoever
in using it. If reports of having been “made
sick” are considered, those who had tried the
ration and disliked it reacted even more fa-
vorably than those who had not tried it at all.

Although related studies have not attacked 
in a direct manner the basic problem posed 
in this paper, such studies suggest that the 
phenomena found in this study may be ex- 
pected to be found in other areas of human 
behavior. Hudson (1), for example, has sum-
marized a number of laboratory and field
studies concerned with anxiety in response to
the unfamiliar. Hudson maintains that train-
ing for meeting emergencies is valuable, not
because it develops the correct behavior pat-
terns per se but because it provides some
stability in an otherwise perceptually unstruc-
tured situation and thereby reduces anxiety.
Schwartz and Winograd (7) in their studies
of troop participation in atomic maneuvers
found that realistic information gained about
atomic effects is related to changes in atti-
tudes of confidence or anxiety toward partici-
pation in atomic maneuvers or warfare. Tor-
rance (10) has also shown that possession of
information about how to survive in extreme
conditions and gains in such information are
related to expressed confidence in ability to
survive such situations.

A study reported by Taylor, Brozek, He? 
schel, Mickelson, and Keys (9) provides po 
sibly the strongest support of the findings ° 

Sent study. These experimenters ca" 
t metabolic, Physiological and psycho- 
measurements on four men who pe! 


ried ou 
motor 


formed hard work under rigidly controlled 
Conditions during five successive two-and-one- 
‘half-day fasts. The successive fasts were 
separated by five- to six-week intervals. Re- 
Sults of the first and fifth fasts were com- 
Pared. During the second and third days of 
the fasting, all Ss maintained the blood sugar 
at a significantly higher level in the fifth as 
Compared to the first fast. Motor speed and 
Coordination, reaction time, and pattern trac- 
ing were also superior during the fifth fast 
When compared to the first. 
In both the study of successive fasting and the present study, the
Ss apparently replaced “fears of unknowns” with realistic knowledge
obtained through actual experience. No doubt those undergoing
successive fasts were initially anxious concerning what might
possibly happen to them as a result of fasting. After they
discovered that they suffered no serious ill-effects, their entire
systems reacted more favorably during later fasts. Pemmican also
involves something of the unfamiliar. It is a somewhat strange
looking little bar of beef and pork mixed with suet. Certainly the
Ss are not accustomed to eating their meat in this form. Using it
removes the strangeness and results in more favorable reactions.
Basically the same process is involved in role playing and
psychodrama in preparing children to meet new experiences (4),
preparing individuals for leadership roles (3), and preparing
hospital patients to adapt to outside life.

Finally, it should be cautioned that though generally improved
reactions may be expected from realistic experiences such as
described in this study, some negative reactions can be expected.
For example, it will be recalled that those who had initially
disliked pemmican were more variable in their consumption of the
ration than those who had had no previous experience with it.


Summary 

The issue of sensitization versus adaptation in preparation for
emergencies was studied in a specific field situation. Four hundred
and sixteen normal adult males undergoing a realistically simulated
survival experience were issued eight meat bars (pemmican) as part
of their emergency ration for the
€n-day exercise. Ratings for five methods 
of preparation, number of bars consumed, reports of having been
“made sick,” and attitude toward future use were used as criteria
of the Ss’ acceptance of the ration.


Sensitization vs. Adaptation in Preparation for Emergencies 67


The Ss who had previously used the ration, 
regardless of whether they liked or disliked 
it, responded more favorably according to all 
four criteria when compared with those who 
had had no experience with the ration. Even 
those who had tried the ration and disliked it 
responded as favorably as those who had not 
tried it. Fewer of those who had disliked the 
ration reported having been “made sick” by 
the ration than those who had never tried it. 

The results have been interpreted as sup- 
porting arguments in favor of realistically 
simulated training as preparation for adapta- 
tion in emergencies. 


Received May 6, 1957. 


References 


1. Hudson, B. B. Anxiety in response to the un- 
familiar. J. soc. Issues, 1954, 10, 53-60. 

2. Johnson, R. E., & Kark, R. M. Feeding problems as related to
environment. An analysis of United States and Canadian Army ration
trials and surveys, 1941-1946. Chicago: Quartermaster Food and
Container Institute for the Armed Forces, 1946.

3. Klein, A. F. Role playing in leadership training 
and group problem solving. New York: As- 
sociation Press, 1956. 

4. Lippitt, Rosemary. Psychodrama in the home. 
Sociatry, 1947, 1, 148-167. 

5. McNemar, Q. Psychological statistics (2nd ed.).
New York: Wiley, 1955.

6. Pippitt, R. G. Ration, special survival, RS-1, 
field test of component acceptability. Wright- 
Patterson Air Force Base, Ohio: Wright Air 
Development Center, 1956. (WADC Tech- 
nical Note 56-216.) 

7. Schwartz, S., & Winograd, B. Preparation of 
soldiers for atomic maneuvers. J. soc. Issues, 
1954, 10, 42-52. 

8. Stefansson, V. Not by bread alone. New York: 
Macmillan, 1946. 

9. Taylor, H. L., Brozek, J., Henschel, A., Mickelsen, O., & Keys,
A. The effect of successive fasts on the ability of men to
withstand fasting during hard work. Amer. J. Physiol., 1945, 143,
148-155.

10. Torrance, E. P. The relationship of attitudes 
and changes in attitudes toward survival ade- 
quacy to the achievement of survival knowl- 
edge. J. soc. Psychol., 1954, 40, 259-265. 

11. Torrance, E. P., & Mason, R. The indigenous
leader in changing attitudes and behavior.
Int. J. Sociometry, 1956, 1, 23-28.

12. Torrance, E. P., & Mason, R. Psychological and sociological
aspects of survival ration acceptability. J. clin. Nutr., 1957, 5,
176-179.


Journal of Applied Psychology
Vol. 42, No. 2                                        April, 1958


The Contribution of Interview and Situational Performance
Procedures to the Selection of Supervisory Personnel ¹


Robert Glaser, Paul A. Schwarz, and John C. Flanagan

University of Pittsburgh and the American Institute for Research


This article presents the results of a study concerned with the
construction and validation of interview and situational
performance procedures for the selection of supervisory personnel.
The study was designed primarily to determine the unique
contribution of such procedures when the effects of paper and
pencil tests and other identifiable predictor variables are
controlled.


Method


Design of the study. Two groups, of 40 supervisors each, were
selected from 227 civilian supervisors employed at two large
military depots. These groups were selected so that they would have
the following characteristics: (a) they would differ as much as
possible on criterion scores of supervisory effectiveness and (b)
they would be as similar as possible on a set of control variables
known to be predictive of supervisory performance.

The experimental interview and situational performance tests were
then administered to these groups. Because the groups were matched
on known predictor variables and differed on the criterion of
effectiveness, it was expected that any differences obtained on the
experimental instruments would reflect only the unique
contributions of these procedures to the identification of high and
low criterion individuals.

Criterion instruments. Criterion data were collected with the
following instruments.

1. The Supervisor Performance Report.² This is a rating form
developed by the Personnel Research Branch, which is made up of
statements descriptive of job behavior. On each item, the evaluator
selects one statement “most descriptive” of the S’s job performance
and one statement “least descriptive” of it. This report was
completed by several raters who were familiar with the S’s
day-to-day performance.

¹ Project sponsored by the Personnel Research Branch of the
Adjutant General’s Office, Department of the Army. Portions of this
article were presented as a paper at the meetings of the American
Psychological Association in Chicago during September.

² This form is part of the present Army Civilian Supervisor
Selection Battery and was made available by the Department of the
Army for this research.

2. Ratings of Supervisor Effectiveness. This is a 
set of three rating scales concerned with different 
aspects of supervisory ability. On Scales 1 and 2, 
the evaluator compares the subject to descriptions of 
four sample supervisors representing four degrees of 
effectiveness. On Scale 3, a supervisor is compared 
to the “ten best” and “ten poorest” supervisors the 
rater has observed in the course of his job experi- 
ence. This rating form was completed by raters 
that had only a general knowledge of the S’s performance and
reputation, and also by his immediate supervisor.

3. The Performance Record (2). This is a day-by-day record form of
specific actions descriptive of effective and ineffective
supervisor performance. This daily record was maintained over a
three-month period by each S’s immediate superior.

A single index of supervisor effectiveness was com- 
puted for each S by combining the scores obtained 
on these criterion instruments. 

Matching (control) variables. In obtaining con- 
trol data for the matching of High and Low cri- 
terion groups, an attempt was made to consider as 
many as possible of the variables known or likely 
to be related to supervisory effectiveness. These in- 
cluded the following: 

Score on a test of basic ability. This is a paper 
and pencil test that is one of the components of the 
present supervisor selection battery developed by the 
Personnel Research Branch. It consists of items on 
verbal meaning, numerical facility, and spatial visualization.

Score on a test of supervisory practices. This is another paper and
pencil component of the present battery. It requires judgments of
appropriate action in hypothetical problem situations similar to
those a supervisor would face on the job. Four or five alternate
ways of dealing with each situation are presented, and the
supervisor is asked to choose the one worst and one best solution.



Table 1

Criterion and Control Scores for High and Low Groups

                              Depot A            Depot B
                            Mean     SD        Mean     SD

Composite Criterion
  High Group                 155    11.6        144    16.5
  Low Group                   50    24.6         52    28.8
Basic Ability
  High Group                  30     7.7         29     6.6
  Low Group                   29     6.8         29     5.3
Supervisory Practices
  High Group                  18     4.2         18     4.0
  Low Group                   17     4.2         18     4.4
Age
  High Group                  40     8.1         41    10.1
  Low Group                   40     7.6         43     9.4
Years as Supervisor
  High Group                 5.3     2.9         3:     3.3
  Low Group                  6.0     2.2          8     3.7
Job Grade
  High Group                 6.6     2.9        4.3     1.8
  Low Group                  5.4     2.8        4.4     2.6

Age. 

Number of years as a supervisor.

Present job grade or level.

Sex and race were al 
distributio 


variances 
and the means of the resulting samples would be 
comparable. The results of this procedure are
shown in Table 1.


Test development and administration. On the basis of detailed “test
rationales” for the supervisor functions to be predicted, the
following instruments were constructed:



St scores was base 


s d on estimated 
scores. 




A Standardized Panel Interview. This was conducted as an informal
discussion between the candidate and a panel of three interviewers.
Topics and probing questions related to supervisory performance and
attitudes were introduced into the discussion according to a
pre-arranged schedule. At the end of the interview, each
interviewer independently completed ratings on the candidate’s
personal characteristics and attitudes, and on particular aspects
of the candidate’s responses.

A Standardized Individual Interview. This was administered and
scored just like the panel interview, but was conducted by only one
interviewer.

A Group Discussion Problem. This was set up as a committee meeting
of four candidates responsible for developing recommendations on a
particular aspect of plant management. An observer scored the
performance of each candidate in terms of (a) a checklist of
specific discussion behaviors and (b) ratings based on the
candidate’s contributions during the discussion. The task of the
examiner was simplified here by the use of a “time-sampling”
procedure, in which the discussion was divided into discrete
observation periods. Within each of these periods only the presence
or absence (rather than the frequency of occurrence) of each
checklist behavior was recorded.

A Role-Playing Situation. Here the candidate was required to deal
with a “staged” personnel problem as he would deal with it in an
actual job situation. An assistant examiner played the role of the
subordinate involved in the problem and interacted with the
candidate in a relatively standardized manner. The examiner
recorded specific aspects of the candidate’s performance on a
checklist of effective and ineffective behaviors.

A Small-Job Management Problem. This involved the utilization of
personnel and materials in a miniature work situation. The
candidate was required to train subordinates, organize the work
flow, and monitor job activities. He was scored by an observer both
on a checklist of effective and ineffective supervisory actions and
in terms of his actual work output.
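
The “time-sampling” procedure described for the Group Discussion Problem can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not the scoring form used in the study: the behavior names, the period length, the number of periods, and the function name `time_sample` are all hypothetical. It shows the one point the description makes, namely that within each observation period a checklist behavior is recorded as present or absent, however often it occurred.

```python
from collections import defaultdict


def time_sample(events, period_length, n_periods):
    """Count, for each behavior, the periods in which it was observed.

    events: iterable of (time_in_minutes, behavior) observations.
    period_length and n_periods are hypothetical values; the article
    does not report how long the observation periods were.
    """
    present = defaultdict(set)  # behavior -> set of period indices
    for t, behavior in events:
        period = int(t // period_length)
        if period in range(n_periods):
            # A set stores each period once, so repeated occurrences
            # inside the same period are recorded only as "present".
            present[behavior].add(period)
    return {b: len(periods) for b, periods in present.items()}


# Hypothetical observations for one candidate in a 20-minute discussion
events = [
    (1.0, "asks for opinions"),
    (2.5, "asks for opinions"),   # same period as the line above
    (7.0, "asks for opinions"),
    (3.0, "interrupts"),
    (12.0, "interrupts"),
    (13.5, "interrupts"),         # same period as 12.0
]
scores = time_sample(events, period_length=5, n_periods=4)
print(scores)
```

Under these made-up observations each behavior is credited with two periods, although each was observed three times; this is the simplification for the examiner that the description refers to.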


This experimental test battery was administered to the selected
sample of 40 supervisors (20 High and 20 Low) at each depot, making
a total sample of 80. Each candidate was tested on two forms of
each of the tests with the exception of the Small-Job Management
Problem, for which replication was not feasible for administrative
reasons. In order to eliminate the possibilities of bias resulting
from prior knowledge of the candidates’ capabilities and
reputations, each depot supplied the examiners for the testing of
the other installation.


Analysis of Results 


Since estimates of predictor variances computed directly from
extreme groups would be overestimates, the computation of these
variances was carried out by the procedure recommended by Peters
(3) and Peters and Van Voorhis (4). This procedure was used in all
computations requiring an estimate of the predictor variance. Since
criterion scores had been obtained for the entire sample from which
the extreme groups were drawn, the value of the criterion variance
was computed directly.

The analysis of results is concerned with three aspects: test
reliabilities, validities of the single predictors, and validities
of composite predictors.


Predictor Reliabilities 


Alternate form reliability coefficients were computed for all
predictors except the Small-Job Management Problem, only one form
of which had been administered. Table 2 shows the results. This
table shows the alternate form reliabilities obtained at each depot
and then the average of the two.


Table 2

Alternate Form Reliability Coefficients

                                Depot A   Depot B   Two-Depot Average*

Standardized Panel Interview      .77       .47          .65
Individual Interview              .70       .29          .53
Group Discussion Problem          .66       .80          .74
Role-Playing Situation            .49       .18          .34

* Averaged by r-to-z transformation.


These results indicate that the Group Discussion Problem in general
is the most reliable of the predictors. The reliability of the
interviews is somewhat lower, with the Panel Interview consistently
superior to the Individual Interview. The reliability of the
Role-Playing Situation is the lowest throughout.
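
The note to Table 2 states that the two-depot averages were obtained by r-to-z transformation. As an illustration only, the sketch below applies Fisher's transformation (z = atanh r) to the Depot A and Depot B coefficients of Table 2, averages the z values with equal weight, and converts back. The function name `average_r` is hypothetical, and equal weighting of the two depots is an assumption; the article does not describe the computation in more detail.

```python
import math


def average_r(r_values):
    """Average correlations through Fisher's r-to-z transformation."""
    z_values = [math.atanh(r) for r in r_values]
    mean_z = sum(z_values) / len(z_values)  # equal weights assumed
    return math.tanh(mean_z)


# Depot A and Depot B alternate form reliabilities from Table 2
table_2 = {
    "Standardized Panel Interview": (0.77, 0.47),
    "Individual Interview": (0.70, 0.29),
    "Group Discussion Problem": (0.66, 0.80),
    "Role-Playing Situation": (0.49, 0.18),
}

for name, depots in table_2.items():
    print(f"{name}: {average_r(depots):.3f}")
```

The values computed this way agree with the averages printed in Table 2 to within about .01; small differences of that size would be expected if the original averages were read from printed conversion tables.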


Validity Estimates for Individual Predictors 


Two indices of validity were computed for each predictor
instrument: a t ratio, representing the significance of the
difference between the mean scores of the High and Low

Table 3 
Significance of High and Low Group Mean Differences (t ratios) on Individual Predictors
ear =20) Oat Boo) Ee Seep 
Panel Interview 
Form I E $ 4 
Form II ve ; 12 
Form I + II 16 5 š 
ar Interview 17 —1.0 2 
Form II N 2 3 
Form I + II 2.0 —2 : 
Group Discussion Problem é ang 2 
Form I ao a 16 
Form I + II í 
Role-Playing Situation 7 =g 2 
Form I + II 1.2 1.0 1.6 
6 5 Š 


Small Job Management 


* Significant at approximately the 10% level.
** Significant at approximately the 5% level.
*** Significant at approximately the 1% level.




Table 4

Biserial r’s for Individual Predictors

                               Depot A          Depot B        Two-Depot Average
                           (N_H = N_L = 20) (N_H = N_L = 20)  (r-to-z combination)

Panel Interview
  Form I                        .17              .00               .09
  Form II                       .23              .04               =
  Form I + II                   pal              .02               `
Individual Interview
  Form I                        .23             -.15               .04
  Form II                       .26              .11               .19
  Form I + II                   ar              -.02               .13
Group Discussion Problem
  Form I                        .09             -.05               .02
  Form II                       .39              .16               .28
  Form I + II                   .27              .07               .17
Role-Playing Situation
  Form I                        .09             -.04               .02
  Form II                       .20              .25               .23
  Form I + II                   .17              .14               .16
Small Job Management            .09              .08               .08


criterion groups; and a biserial correlation coefficient, based on
the estimates of the predictor variances obtained from the Peters
and Van Voorhis procedure. The analysis of mean differences
constitutes a test of the null hypothesis that the predictors fail
to discriminate between the extreme groups, while the validity
coefficients provide an estimate of the relationship between the
predictors and the criterion over the total sample.³

These results are summarized in Tables 3 and 4, respectively. In
reviewing these data, it should be remembered that the level of
prediction indicated by these analyses represents only that portion
of an instrument’s total predictive power that is independent of
the Basic Ability and Supervisory Practices tests and of the other
matching variables; and that the range of talent in the present
sample is restricted by the use of experienced supervisors, so that
the obtained results are lower than those likely to be found for a
group of typical candidates.

The results reported in Tables 3 and 4 indicate the following: The
Group Discussion Problem shows up as perhaps the most promising of
the instruments; contrary to expectations, the Panel Interview
shows no superiority over the Individual Interview; the interview
procedures and the Role-Playing Situation are about equally
successful; the Small Job Management Problem shows little promise
for predictive effectiveness as it was administered in this study.

³ As an indication of the extent to which the biserial coefficients
from widespread classes approximated the coefficients that would
have been obtained from total group data, both kinds of
coefficients were computed for the tests of Basic Ability and
Supervisory Practices. The results showed close agreement for both
tests at both depots.


Table 5

                                      Depot A     Depot B    Two-Depot Average
                                     (N = 109)   (N = 118)  (r-to-z combination)

Criterion vs. Basic Abilities           .28         .23            .25
Criterion vs. Supervisory Practices     .26         .19            .23
Criterion vs. (Basic Abilities +
  Supervisory Practices)                .29         .25            .27


Table 6

Multiple r’s for Present Battery Plus Composites of Experimental Predictors*

                                                         Two-Depot Average
                                    Depot A   Depot B   (r-to-z combination)

(BA + SP) + GD                        .38       .26            .32
(BA + SP) + PI                        .35       .25            .30
(BA + SP) + II                        .38       .25            .32
(BA + SP) + RP                        .32       .27            .30
(BA + SP) + GD + RP                   .38       .27            .33
(BA + SP) + GD + II                   .42       .25            .34
(BA + SP) + GD + II + PI              .41       .25            .33
(BA + SP) + GD + II + RP              .40       .26            .33
(BA + SP) + GD + II + PI + RP         .40       .26            .33

* BA = Basic Ability; SP = Supervisory Practices; GD = Group
Discussion Problem; PI = Panel Interview; II = Individual
Interview; RP = Role-Playing Situation.


Predictor Composites and Operational Batteries


The predictive values of several combinations of the test
instruments are presented in terms of the total validity of the
experimental predictors in combination with the Basic Ability and
Supervisory Practices tests of the present battery; a combination
of this type would be employed in actual practice. Table 5 presents
the validities of the existing Basic Ability and Supervisory
Practices tests, singly and in combination (equally weighted). In
Table 6 these tests are combined with different composites of the
predictors studied. In all composites test components received
equal weighting. Since the criterion groups were matched on Basic
Ability and Supervisory Practices scores, the correlation of these
tests with the experimental predictors in this case is essentially
zero, and multiple correlation coefficients were computed on this
basis.

In general, these findings of the study indicate that some
contribution to the predictive value of the paper and pencil tests
can be made by the addition of the predictors studied. The results
are not differentiating enough between the experimental tests to
suggest certain of the predictors to the exclusion of others. From
a practical point of view, however, the similarity of the results
obtained with the panel and individual interview procedures
suggests that the more economical individual interview might be
used where an interview is specifically desired. With respect to
further development of these tests, the fact that the Role-Playing
Situation gave validities comparable to other predictors despite
its low reliability suggests that revision of the administrative
and scoring procedures of this test may be fruitful. Finally, it
seems important in terms of testing efficiency that the Group
Discussion Problem permitted comparable evaluation of four
candidates in the same amount of time in which one candidate could
be evaluated in an interview.

Received April 22, 1957. 
References 


1. Flanagan, J. C. The use of comprehensive ra- 
tionales in test development. Educ. psychol. 
Measmt, 1951, 11, 151-155. 

2. Flanagan, J. C., & Miller, R. B. The perform- 
ance record program. Chicago: Science Re- 
search Associates, 1955. 

3. Peters, C. C. A technique for correlating measur- 
able traits with freely observed social behav- 
iors. Psychometrika, 1941, 6, 209-219. 

4. Peters, C. C., & Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill, 1940.


Journal of Applied Psychology
Vol. 42, No. 2, 1958


A Further Note on the Fakability of the MTAI ¹


A. G. Sorenson and M. S. Sheldon 


School of Education, University of California, Los Angeles 


The problem of getting honest, objective, and straightforward
answers to personality and interest inventories has been of concern
to test users for some time. A related problem, and the concern of
this paper, is how to find out whether or not a particular
inventory, in this case the Minnesota Teacher Attitude Inventory,
can be falsified. There are at least five published studies (1, 2,
4, 5, 6) of the fakability of the MTAI. Each of the investigators
had his subjects complete the inventory twice under differing
conditions. However, each used somewhat different instructions and
reported somewhat different findings. Since it appeared that some
of the discrepancies in findings might be due to the fact that each
investigator was concerned with the effect of different conditions
of administration, it was decided to conduct a study which
incorporated all these conditions in a factorial design. The report
which follows will present the findings of that study and compare
them with those of earlier studies. It will also discuss some of
the implications of the findings for fakability studies in general.


Review of Three Representative Studies 


Callis (1), one of the authors of the MTAI, worked with three
groups of college students. The students in one group first
completed the inventory under standard directions. Four to six
weeks later they repeated the inventory under directions to “fake
good,” i.e., to make as high a score as possible by answering the
items the way they thought a good teacher would. In a second group
the students were asked to “fake good” on the first administration
of the inventory. A week to 10 days later they repeated the
inventory under standard directions. A third group, the control,
was also tested twice, a week to 10 days apart, and received
standard directions both times. Callis’ results showed that the
group which “faked good” on the second administration raised its
mean score 9.6 points, a difference significant at the .01 level of
confidence. The mean faked score for the group that was told first
to fake and then to repeat under standard directions was a
statistically insignificant 1.8 points higher than its mean score
under standard directions. (This would indicate that “Order” as a
variable, i.e., whether the subjects received standard directions
prior to faking directions or vice versa, should be considered in a
fakability study.) The group which worked under standard directions
both times added 4.2 points to its mean upon the second completion
of the inventory. This gain was statistically significant. Callis
concluded that the MTAI “may be susceptible to faking to a limited
extent.”

¹ This study was supported in part by the Fund for Occupational
Research of the School of Education, UCLA.

Rabinowitz (4) also tested three groups twice. A control group
responded both times under standard directions. A second group
responded first under standard directions and second after being
instructed to fake in a permissive direction, i.e., as if applying
for a job at a school where the principal thought good teachers
were characterized by “mutual affection and sympathetic
understanding.” A third group responded first under standard
directions and second after being instructed to fake in an
authoritarian direction, i.e., as if applying for a job at a school
where the principal thought good teachers were characterized as
maintaining “relations in which the pupils respect the authority of
the teacher and the teacher accepts that authority as a trust.” All
the subjects were tested at one session. None was aware that a
second testing would occur, nor did any have access to their first
answers. An analysis of variance was performed on the results of
both the first and second administrations. The first administration
of the test, where standard directions were used, did not produce
significant differences among the three groups. On the second
administration, the groups receiving the two sets of faking
instructions responded differently from each other and from the
group which proceeded under standard instructions. The differences
were significant at the .01 level. (This would indicate that
instructions to fake in a particular direction produce different
results. Apparently instructions as to direction may provide a
“responding set,” i.e., a cue as to how the inventory can be faked,
and “direction” instructions become a relevant variable in
fakability studies.) Rabinowitz further found that the group which
completed the inventory twice under standardized directions did not
change its mean score significantly. Both groups who faked produced
means which differed significantly from their original mean scores.
Rabinowitz concluded on the basis of these findings that the MTAI
“may have limited value for selection purposes.”

Sorenson (5) had two groups of elementary teacher candidates and
two groups of secondary teacher candidates complete the inventory,
first under standard directions and a second time as if applying
for a job “in a school system which is known to be progressive.”
Half the subjects in each group were told to sign their names to
the answer sheets and half responded anonymously. All groups raised
their mean scores significantly under faking instructions. There
was a statistically significant difference between the means of
those subjects who signed and those who did not sign the answer
sheets under standard directions. There was a significant
difference in the mean “gains” score of the signed and unsigned
groups. The “gains” score was the difference between an
individual’s scores on the two administrations. (Thus it is
indicated that whether or not subjects sign the answer sheets is a
relevant variable in fakability studies.) Sorenson concluded that
the MTAI probably can be faked by prospective teachers and that a
subject’s “beliefs regarding the use to which the scores will be
put and his understanding of the directions may influence his
responses to an inventory.”

According to these three studies then, three variables which should
be controlled in an investigation of the fakability of the MTAI are
(a) Order, whether the subjects respond to standard directions
first or to faking directions first; (b) Responding Set, whether or
not the subjects are instructed to fake in a particular direction,
since such directions may provide a cue as to the correct answers;
(c) Signing, whether or not the subjects’ identities are to be
known.


Procedures Employed in the Factorial Study 


The sample consisted of 156 students in the School 
of Education at UCLA—all candidates for the sec- 
ondary teaching credential. The subjects were ran- 
domly assigned to 12 groups. The 12 groups were 
then divided into three sets of 4 groups each. Then 
each set was divided into two pairs. All the sub- 
jects completed the MTAI twice, once having re- 
ceived only the directions printed on the cover of 
the inventory, and once having been instructed to 
fake the inventory in one of the following three 
ways. The first set of four groups was instructed 
to “assume that you are applying for a teaching po- 
sition in a school rumored to be progressive. Re- 
spond to the items of this inventory in a way that 
you think will be most likely to get you the job.” 
The second set of four groups received the same 
faking directions except that the word “traditional” 
was substituted for “progressive.” The remaining 
set of four groups was instructed to “assume that 
you are applying for a teaching position. Respond 
to the items of this inventory in a way that you 
think will be most likely to get you the job.” No 
indication as to the nature of the school was given. 

Half of the subjects, one pair of groups from each 
of the sets of four, were given the faking instruc- 
tions first and then asked to complete the inventory 
under standard directions; while the other half of 
the subjects, i.e., the alternate pairs of groups, com- 
pleted the inventory under standard directions first 
and then were asked to fake. In one group from 
each pair the students were instructed not to identify 
themselves while in the second group of each pair 
they were told to print their names on the answer 
sheets. 
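
The design just described crosses three responding sets, two orders of administration, and two signing conditions, giving the 12 groups. The sketch below is an illustration only: it enumerates the 12 cells and assigns 156 subject numbers to them at random. The names `SETS`, `ORDERS`, `SIGNING`, and `assign` are hypothetical, and equal groups of 13 are an assumption; the article reports random assignment to 12 groups but does not state the group sizes.

```python
import itertools
import random

SETS = ("progressive", "traditional", "no direction given")
ORDERS = ("standard first", "faking first")
SIGNING = ("signed", "anonymous")


def assign(subject_ids, seed=0):
    """Randomly assign subjects to the 3 x 2 x 2 = 12 cells.

    Equal cell sizes are assumed for illustration only.
    """
    cells = list(itertools.product(SETS, ORDERS, SIGNING))
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    groups = {cell: [] for cell in cells}
    for i, subject in enumerate(ids):
        # Deal the shuffled subjects round-robin over the 12 cells.
        groups[cells[i % len(cells)]].append(subject)
    return groups


groups = assign(range(1, 157))  # 156 subjects, as in the study
for cell, members in groups.items():
    print(cell, len(members))
```

With 156 subjects and 12 groups, the degrees of freedom within groups are 156 - 12 = 144, which matches the within-groups df shown in the analysis of variance tables.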

Both administrations of the inventory took place 
during a single session. The students received writ- 
ten instructions. There was one proctor for each of 
the 12 groups. As soon as a subject finished the in- 
ventory, the proctor picked up his answer sheet and 
the instruction sheet, and handed him a second sheet 
of instructions and a second answer sheet numbered 
the same as his first. 

The answer sheets were scored and rescored by machine using an
elimination key. A constant of 100 was added to each score.
“Change” scores were computed for each subject, and a constant of
200 was added to each. Except for the groups which had faked in the
traditional direction, the change scores were computed by
subtracting the scores obtained under standard directions from the
faked scores, to determine how much the students raised their
scores when they faked. In the case of the four groups who faked in
the traditional direction the process was reversed to determine how
much they had lowered their scores when they faked.
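
The scoring rule in the paragraph above can be written out as a small sketch. This is an illustration under assumptions, not the authors' procedure in detail: the function name `change_score`, the direction labels, and the example raw scores are hypothetical. Only the two constants (100 and 200) and the reversal for the traditional groups come from the text.

```python
SCORE_CONSTANT = 100    # added to each MTAI score
CHANGE_CONSTANT = 200   # added to each change score


def change_score(standard_raw, faked_raw, direction):
    """Return the change score for one subject.

    direction is "traditional" for the groups told to fake in the
    traditional direction; any other label is scored as faked minus
    standard.
    """
    standard = standard_raw + SCORE_CONSTANT
    faked = faked_raw + SCORE_CONSTANT
    if direction == "traditional":
        difference = standard - faked   # how far the score was lowered
    else:
        difference = faked - standard   # how far the score was raised
    return difference + CHANGE_CONSTANT


# Hypothetical raw scores for two subjects
print(change_score(40, 75, "progressive"))    # raised by 35
print(change_score(40, -10, "traditional"))   # lowered by 50
```

Because the constant of 100 is added to both scores it cancels in the difference; the added 200 only shifts the change scores upward.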

The statistical treatment of the data included four 
three-way analyses of variance. The first analysis 
of variance was of the scores obtained under stand- 
ard directions, the second was of faked scores, and 
the third was of change scores. In each of these 
three analyses of variance the main effects were order, responding
set or direction, and signing. Thus each analysis employed a
2 × 2 × 3 factorial design.

A fourth analysis of variance was performed on 
the scores of the four groups told to fake but who 
were not given a responding set, i.e., who were given 
no suggestion as to direction. The main effects in 
this analysis were order, signing, and change (a com- 
parison of the scores under standard instructions 
with those under faked instructions). 


Results 


The results of the four analyses of variance 
appear in Tables 1 through 4. Table 1 pre- 
sents the results of the analysis of variance of 
the scores achieved under standard instruc- 
tions. None of the three main effects nor 
any of the interactions resulted in F ratios 
which are statistically significant. Although 
the means for the 12 groups varied from a 
low of 31 to a high of 60, the over-all test of 
significance indicates that these means could 
have been drawn from the same population 
by chance. 


Table 2 shows the analysis of variance of 


Table 1

Analysis of Variance of MTAI Scores Obtained
Under Standard Directions

Source of Variation                           df         MS         F

Order                                          1      1,628.30     1.67
Responding set                                 2      2,522.33     2.59
Identification                                 1      1,046.26     1.07
Order × Responding set                         2        147.26      *
Order × Identification                         1         77.57      *
Identification × Responding set                2      1,595.51     1.64
Order × Identification × Responding set        2         92.20      *
Within groups                                144        974.74
Total                                        155

* Less than unity.


Table 2 


Analysis of Variance of MTAI Scores Obtained 
Under Faking Directions 


Source of Variation df MS F 
Order 1 18,156.98  8.18** 
Responding set 2 129,921.33 58.54** 
Identification 1 iog #e 
Order X Responding set 2 13,183.70  5.94* 
Order X Identification 1 53109'  * 
Identification X Respond- 

ing set 2 2,432.34 110 
Identification X Respond- 

ing set X order 2 864.15 *** 
Within groups 144 2,219.31 
Total 155 


* Significant at the .05 level of confidence. 
** Significant at the .01 level of confidence. 
*** Less than unity, 


the scores achieved under faking conditions: 
The F ratios for the main effects of order an 
responding set are both statistically signifi- 
cant at the .01 level of confidence.? The in- 
teraction between these variables is significant 
at the .05 level. f 

Table 3 shows the results of the analysis of variance of the change scores. It will be seen that the F ratios for order and responding set and their interactions again are statistically significant.

Table 4 shows the results of the analysis of variance performed on the scores of the four groups not given a responding set. The only effect or interaction of statistical significance is identification, or whether or not the subjects signed the answer sheet.


Discussion 


In light of the factorial study and the studies reviewed above, it would appear that the following statements can be made concerning the fakability of the MTAI as it has been studied thus far:

1. When the subjects have a responding set the chances of faking successfully will apparently be much greater than if they do not [have] a cue. The difference between the findings of Callis on the one hand and Rabinowitz and Sorenson on the other might be accounted for by the fact that both Rabinowitz and Sorenson introduced a responding set, i.e., instructions to fake in a particular direction, whereas Callis did not.

2. When subjects complete the inventory under directions to fake before they respond under standard instructions, the change scores are likely to be smaller than if the reverse order is used.

3. The effect of identification, or signing, is undetermined, since the findings are not consistent.

4. The question of the effect of practice on the MTAI is one about which the present study provides only little evidence, but probably deserves comment. It will be recalled that when Callis' control group repeated the inventory under standard instructions, it showed an increase in mean score which was statistically significant. Rabinowitz also had a control group repeat the inventory under similar conditions but it did not show a significant increase in mean score.


² The Bartlett test for homogeneity was made before carrying out any of the analyses. The variances for the groups under standard directions and for the groups not given a responding set were found to be homogeneous. The variances for the groups under faking directions and for the change scores were found not to be homogeneous. However, Cochran's test indicated that the variances were not so disparate as to affect the analyses. This, together with the Norton study (3), encouraged the present authors to utilize the analysis of variance technique [...] the assumption of homogeneity was [...]


Table 3


Analysis of Variance of MTAI "Change" Scores


Source of Variation                          df           MS            F
Order                                         1    10,700.41       4.75*
Responding set                                2    79,820.85      35.46**
Identification                                1     3,663.69       1.63
Order X Responding set                        2    11,080.72       4.92**
Order X Identification                        1     2,464.11       1.09
Identification X Responding set               2  [illegible]  [illegible]
Order X Identification X Responding set       2  [illegible]  [illegible]
Within groups                               144     2,251.12
Total                                       155


* Significant at the .05 level of confidence.
** Significant at the .01 level of confidence.
*** Less than unity.


Table 4


Analysis of Variance of MTAI Scores Obtained Under
Standard Instructions and Under Faking
Instructions When No Responding
Set Was Given


Source of Variation                  df           MS            F
Order                                 1     2,233.88       2.87
Change                                1       162.49        ***
Identification                        1     4,472.34       5.74**
Order X Change                        1     1,137.86       1.46
Order X Identification                1  [illegible]  [illegible]
Identification X Change               1       199.40        ***
Identification X Change X Order       1        13.86        ***
Within groups                        96       779.06
Total                               103


** Significant at the .01 level of confidence.
*** Less than unity.


In the present study the four groups which 
completed the inventory under standard di- 
rections and then repeated under instructions 
to fake, but with no responding set being 
given, did not show a significant increase. 
The present study is similar to that of Ra- 
binowitz in that the subjects repeated the in- 
ventory during the same session, whereas 
Callis’ students repeated after a delay of a 
week or more. A possibility to be considered 
is that through discussion or other means the 
subjects acquired cues which influenced their 
second performance on the inventory. 

As was suggested earlier, whether or not a 
student will attempt to fake may depend 
upon the use he expects to be made of the 
scores. It is more likely that he will attempt 
to fake in a selection situation than in a 
counseling situation. In none of the studies 
reported here were the subjects completing the 
inventory under selection conditions. They 
were only asked to pretend that they were 
performing under selection conditions.³

³ The authors administered the MTAI to a group of students as a part of the routine selection process in the School of Education at UCLA. It is assumed that these samples are from the same population as that reported earlier by Sorenson (5). In that study two groups of prospective elementary teachers achieved means, under standard directions, of 51 and 45, and SD's of 22 and 28, respectively. Each of two groups of prospective secondary teachers achieved a mean of 41 and SD's of 30. In the present study the prospective elementary teachers, N = 79, achieved a mean of 46 with an SD of 29. The prospective secondary teachers achieved a mean of 37 and SD of 29. Obviously these data do not support the hypothesis that students completing the MTAI under selection conditions will as a group show higher scores.


While it would appear that groups of students are not able to fake the MTAI, unless given a cue, there are at least two other questions relative to the faking problem that deserve attention. In the studies at UCLA there have been individuals who made statistically significant increases in scores, even without a cue from the directions. How do such students differ from those who did not change their scores, or who changed in the wrong direction? Second, what would be the predictive validity of the MTAI if it were administered under selection conditions?


Summary


This study employed a factorial design to investigate the effects of several conditions of administration on the fakability of the MTAI. The findings indicate that in the kind of fakability studies which have been conducted with the MTAI, whether the subjects fake the test first and then respond under standard instructions, or vice versa, and whether the instructions give a cue as to the nature of the inventory, will influence the results. These findings are discussed in relation to several previous studies. In general the findings support the conclusion that groups of students are not likely to be able to fake the MTAI unless they receive a cue from the faking instructions, or elsewhere, as to what the inventory is about.


Received April 4, 1957. 


References 


1. Callis, R. Change in teacher-pupil attitudes related to training and experience. Educ. psychol. Measmt, 1950, 10, 718-727.
2. Coleman, W. Susceptibility of the MTAI to faking with experienced teachers. Educ. Administration and Supervision, 1954, 40, 234-237.
3. Norton, D. W. An empirical investigation of some effects of non-normality and heterogeneity on the F-distribution. In E. F. Lindquist (Ed.), Design and analysis of experiments in psychology and education. Boston: Houghton Mifflin, 1956. Pp. 78-86.
4. Rabinowitz, W. The fakability of the Minnesota Teacher Attitude Inventory. Educ. psychol. Measmt, 1954, 14, 657-664.
5. Sorenson, A. G. A note on the "fakability" of the Minnesota Teacher Attitude Inventory. J. appl. Psychol., 1956, 40, 192-194.
6. Stein, H. L., & Hardy, J. A validation study of the Minnesota Teacher Attitude Inventory in Manitoba. J. educ. Res., 1957, 50, 321-33[?].


Journal of Applied Psychology
Vol. 42, No. 2, 1958


An Information Analysis of Verbal and Motor Responses to
Symbolic and Conventional Arabic Numerals ¹


Earl A. Alluisi and Hugh B. Martin ²


Laboratory of Aviation Psychology, Ohio State University 


Symbols generated from multi-element "printing" matrices are frequently used to represent more conventional Arabic numerals. An example is the automatic score board used in large sporting stadiums. Recently, the feasibility of several visual coding schemes for displaying various types of numerical information in air traffic situations has been studied. Typical potential uses would include an airborne "printer" for transmitting information to an aircraft crew (5) and electronic "printers" for indicating the identity of a target blip on a cathode ray tube display (1, 8, 12).

Cohen and Webb (5) studied the use of symbolic Arabic numerals that were generated from a six-element straight-line matrix. They found performance with conventional numerals superior to performance with these symbolic numerals. For further study, they suggested the use of an eight-element matrix because it appeared to provide an improved series of symbolic numerals as well as a fairly readable series of symbols representing the letters of the English alphabet (5, pp. 8-?).

The symbolic Arabic numerals used in this study are based on those suggested by Cohen and Webb, and on the results of another study in which optimal symbols were selected for several of the numerals (12, pp. 84-89). These numerals are shown in Fig. 1 along with the basic eight-element straight-line matrix from which they were drawn. For transmitting numerical information, performance with these numerals has been found to be not much different from performance with conventional numerals (1). This appears to be a reasonable finding in view of other evidence apparently favoring performance with straight-line and angular numerals over performance with more conventional figures.³ The present study was designed to test this finding.


¹ This research was supported in part by the U. S. Air Force under Contract No. AF 33(616)-43, and Contract No. AF 33(616)-3612; Project No. R[...] with The Ohio State University Research Foundation, monitored by the Aero Medical Laboratory of the Wright Air Development Center. Permission is granted for reproduction, translation, publication, use, and disposal in whole or in part by or for the United States Government. The authors wish to acknowledge the assistance of Ilse B. Webb in the analysis of the data, and the many helpful [contributions] of P. M. Fitts and R. W. Queal, Jr.

² [Now] at the Army Medical Research Laboratory, Fort Knox, Kentucky.

Specifically, the present study was designed to compare the information-handling performance of Ss responding to two sets of Arabic numerals: one a set of conventional figures, the other an optimized set of straight-line symbolic figures. In addition, verbal as well as motor (key-pressing) responses were used because previous findings (1) had indicated the possibility of an interaction between the two types of numerals and these two modes of response. Finally, practice effects were studied for both types of numerals and for both modes of response.


Fig. 1. The eight-element straight-line matrix from which were drawn the symbolic Arabic numerals used in this study. The symbolic numerals are identified in the figure by the smaller conventional Arabic numerals below the symbols.


³ For example, in reviewing the results of some 13 studies on legibility, Tinker (15) concluded that maximum legibility was obtained with Roman capitals (figures made up almost entirely of straight lines and sharp angles). Berger (3), in designing numerals to give optimal visibility for white lettering on black, patented a series of numerals also consisting almost entirely of straight lines. More recently, Lansdell (7) compared two standard sets of numerals (the Mound and the Mackworth) with a new set of angularly formed numerals, and found performance under difficult viewing conditions to be better with the new set than with the standard.


Method 


The experiment was conducted in two parts. In 
Part M, 24 Ss made motor (key-pressing) responses 
to the different stimuli over a period of two days, 
and five Ss continued responding over 10 additional 
days. In Part V, verbal responses were made by a 
different group of Ss under otherwise identical con- 
ditions. 

Apparatus. A Serial Discrimeter, designed and 
constructed in the Laboratory of Aviation Psychol- 
ogy, The Ohio State University, was the apparatus 
used. It has been described elsewhere (1, 10, 11) 
and is similar in function to an instrument described 
by Morin and Grant (9). Essentially, it consists of 
five basic components: Programming unit, switch- 
ing unit, display unit, response unit, and scoring 
unit. 

The Programming unit consists of a board con- 
taining 100 rows of ten single-pole double-throw 
toggle switches each. One of the ten switches in 
each row is set by E; this fixes the sequence of 
stimuli to be activated on each of 100 serial stimu- 
lus presentations. 

The switching unit interrogates the rows of the 
Programming unit in serial order and transmits sig- 
nals to the display unit. Under self-pacing condi- 
tions, the switch advances from one row to the next 
whenever a response is made by S. 

Many different display units and response units 
can be used with the apparatus. The display used 
in the present study consisted of a 10-in. diameter 
opal glass screen. The various symbols were pro- 
jected onto the screen from the back by means of a 
ten-unit optical projector, each unit of which con- 
tained a different photographic transparency. 

The response unit used in Part M consisted of a bank of ten finger keys placed directly in front of S on a table. The keys were arranged horizontally in two semicircles to correspond with the natural placement of the ten finger tips. The finger keys were numbered, from left to right, 1, 2, 3, ..., 9, 0 (0 represented 10), and S was told to respond by pressing the key corresponding to the specific symbol presented on the screen. The response unit used in Part V consisted of a boom microphone connected to a square-wave impulse amplifier, and S responded by speaking into the microphone ("one," "two," etc.). During both parts, broad-band noise at approximately 70 db. was presented to S through earphones in order to mask extraneous sounds.

The scoring unit consists of two elements: a timing element and an automatic recording element. The timing element used here was a Standard Electric Timer on which was recorded the total time for a series of 100 stimulus presentations. The automatic recording element consists of a two-dimensional matrix containing 110 three-place electromechanical counters arranged in a 10 X 11 stimulus-response matrix.

Although the timing element was used during both 
parts of the experiment, the automatic recording ele- 
ment was used only with the key-pressing responses 
of Part M. It was not used in Part V because the 
apparatus could not discriminate among the differ- 
ent verbal responses. Error patterns in Part V were 
recorded by a monitor using lists of the programmed
stimuli; audiotape recordings were also taken for 
use in resolving any ambiguity in this scoring. 

Stimuli. Two sets of Arabic numerals, one symbolic and one conventional, were used in this experiment. The symbolic numerals were straight-line figures generated from an eight-element matrix; these were illustrated in Fig. 1. The conventional numerals were the AND-10400 numerals recommended by Baker and Grether (2) for use in instrument identification. The symbols from each of the two sets appeared in the experiment as [...] high light patterns at the center of the display, approximately 28 in. in front of the seated S.

Subjects. The Ss were 48 men obtained from [...] were randomly divided into two groups of 24 each; each group served in only one of the two parts (M or V) of the experiment.

Procedure. Each S responded for one session of five trials on each of two successive days. In addition, five volunteer Ss from each group continued to respond for a five-trial session on each of the ten succeeding working days for a total of 12 sessions, or 60 trials in all.

Each trial consisted of 200 stimulus presentations: 100 presentations of stimuli from each of the two sets of numerals. Each series of 100 stimuli consisted of ten of each of the symbols in either the set of conventional or the set of symbolic numerals. The order of symbols in each series was random and different for each series.

The 12 odd-numbered Ss in each group responded to the conventional numerals during the first half of each trial in the first session, and during the second half of each trial in the second session; during the remainder of their trials, they responded to the symbolic numerals. The even-numbered Ss responded to the two sets of numerals in a complementary order.
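The construction of a trial described above (two series of 100 stimuli, each holding ten of every symbol in random order, with the order of the two numeral sets counterbalanced) can be sketched as follows. This is an illustrative reading of the procedure, not the authors' apparatus settings: the function names are hypothetical, and the counterbalancing rule is only defined for the first two sessions because the text does not describe the later ones.

```python
import random

SYMBOLS = list("1234567890")  # the ten numerals; "0" stood for 10 on the keys


def make_series(rng, repeats=10):
    """Return one series of 100 stimuli: ten of each symbol, randomly ordered."""
    series = SYMBOLS * repeats
    rng.shuffle(series)
    return series


def conventional_first(subject_number, session):
    """Counterbalancing for sessions 1 and 2 as described in the text.

    Odd-numbered Ss: conventional numerals in the first half of each trial
    in session 1 and in the second half in session 2. Even-numbered Ss
    follow the complementary order.
    """
    if session not in (1, 2):
        raise ValueError("order for later sessions is not given in the text")
    odd = subject_number % 2 == 1
    return odd if session == 1 else not odd


def make_trial(rng, subject_number, session):
    """Return a trial as two (numeral_set, series) halves, 200 stimuli in all."""
    order = ["conventional", "symbolic"]
    if not conventional_first(subject_number, session):
        order.reverse()
    return [(numeral_set, make_series(rng)) for numeral_set in order]


trial = make_trial(random.Random(0), subject_number=1, session=1)
```

Drawing a fresh shuffle for every series mirrors the statement that the order was random and different for each series.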


A short rest [...] was given between [...] and also between the trials [...] a half [...] the set with which he was to work [...] familiarization with the numerals was [...]

In Part M, S was instructed to respond by pressing a corresponding finger key whenever a symbol appeared. In Part V, S was instructed to respond verbally by calling out the numeral displayed. In both parts of the experiment, S was instructed to respond as rapidly as he could, but to maintain as far as possible an error-level below 5 per cent of the responses in each trial.


Results 


The data obtained were summarized in the 
form of stimulus-response matrices—one ma- 
trix for each 100 responses made by one S to 
one set of numerals during one of the trials. 
From each such matrix, the amount of infor- 
mation transmitted (in bits/stimulus) was 
computed with procedures that have been de- 
scribed elsewhere by Shannon and Weaver 
(14) and others (6, 13). 

Each of these scores was then divided by the total time S had taken in making his 100 responses, and the amount of information transmitted by the average S (in bits/sec.) was computed as the arithmetic mean of the scores for individual Ss. The results obtained in the two parts of the experiment are shown in Fig. 2 for the two sessions during which 24 Ss participated with each mode of response.
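The scoring just described can be illustrated with a short sketch. It uses the standard estimate of transmitted information from a stimulus-response count matrix, T = H(stimulus) + H(response) - H(stimulus, response), which is assumed here to correspond to the cited procedures. To express the rate in bits/sec. the sketch multiplies bits per stimulus by the number of responses before dividing by the total time; that scaling is an assumption about how the division was carried out. All function names are hypothetical.

```python
import math


def entropy_bits(counts):
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def transmitted_bits_per_stimulus(matrix):
    """Information transmitted, from a matrix of counts.

    Rows are stimuli and columns are responses, so the matrix need not be
    square (the apparatus recorded a 10 X 11 matrix).
    """
    row_totals = [sum(row) for row in matrix]
    column_totals = [sum(column) for column in zip(*matrix)]
    cells = [count for row in matrix for count in row]
    return entropy_bits(row_totals) + entropy_bits(column_totals) - entropy_bits(cells)


def transmission_rate(matrix, total_seconds):
    """Bits per second: total bits transmitted divided by total response time."""
    n_responses = sum(sum(row) for row in matrix)
    return transmitted_bits_per_stimulus(matrix) * n_responses / total_seconds


# Error-free responding to 100 stimuli, ten of each of ten symbols.
perfect = [[10 if r == s else 0 for r in range(10)] for s in range(10)]
bits = transmitted_bits_per_stimulus(perfect)
```

With error-free responding to ten equally frequent symbols, the measure reaches its ceiling of log2(10) bits per stimulus, so differences in rate then reflect response time alone.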

Fig. 2. Information transmitted by the average S in making verbal and motor responses to conventional and symbolic Arabic numerals as a function of practice. Each data point is based upon the responses of each of 24 Ss to 100 stimulus presentations (or 2,400 responses per point). Different groups of Ss were used in making the verbal and motor responses. [Ordinate: Information Transmitted (bits/sec.); abscissa: Trials.]


Table 1


Mean Time and Mean Number of Errors Per 100 Verbal
and Motor Responses to Conventional and
to Symbolic Arabic Numerals
During Two Sessions


                      Mean Time per            Mean Number of
                      100 Responses (sec.)     Errors per 100 Responses
                      Session I   Session II   Session I   Session II

Verbal Responses
  Conventional           78.9        75.0        0.267       0.333
  Symbolic               92.4        84.4        0.725       0.658
  P of difference     [illegible] [illegible]  [illegible]     *

Motor Responses
  Conventional          100.6        88.2        3.150       2.942
  Symbolic              107.5        88.5        3.283       2.317
  P of difference     [illegible]


Note. Motor responses were made by 24 Ss in Part M, and verbal responses were made in Part V by a different group of 24 Ss. During each session, each S responded to five trials of 100 stimulus presentations from each of the two sets of numerals. Thus, each of the eight time (or error) means reported is based upon 5 X 24 = 120 trials of 100 stimulus presentations per trial, or upon 12,000 responses in all.

* P < .05.
** P < .01.
*** P < .001.


The statistical significance of the differences here (and throughout the experiment) was tested by use of a nonparametric method. Sign tests made at each of the successive trials indicated the following: (a) With verbal responses, the amount of information transmitted with the conventional numerals was significantly greater than that transmitted with the symbolic on each of the ten trials (P < .001 in each case). (b) When motor responses were made, however, only the differences obtained in the first and second trials were statistically significant (P < .01 and P < .05, respectively).
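For readers unfamiliar with the sign test, a minimal sketch follows. It counts how many paired comparisons favor each numeral set and asks how likely so lopsided a split would be if either direction were equally probable. The two-sided form used here is an assumption, since the text does not say whether one- or two-tailed probabilities were reported, and the counts in the example are hypothetical.

```python
from math import comb


def sign_test_p(n_plus, n_minus):
    """Two-sided sign-test probability for a split of n_plus versus n_minus.

    Ties are assumed to have been dropped before counting. Under the null
    hypothesis each comparison is equally likely to go either way.
    """
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical case: all 24 Ss do better with one set of numerals.
p_all_24 = sign_test_p(24, 0)
```

A split in which every one of 24 Ss favors the same set falls far below the .001 level, whereas an even split gives a probability of 1.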

Separate summaries of the time and error 
aspects of performance are presented in 
Table 1. These data indicate that, when 
verbal responses were made, the conventional 
numerals were used with greater speed and 
accuracy than were the symbolic numerals; 
this was true for both sessions. With motor 
responses, however, only during the first ses- 
sion were the conventional numerals used 
with greater speed than the symbolic, and 
during the second session the symbolic nu- 
merals were apparently used with greater ac- 
curacy than were the conventional; during 
both sessions, however, the differences in ac- 


curacy were relatively small with the motor 


responses. 
Effects of longer-term practice. Ten Ss 


(five each in Parts M and V) continued re- 


sponding for a total of 12 sessions in order 


oo 
N 


G 


a 
o 


© Verbal 
e Motor 
— Conventional } Arobic 
=== Symbolic Numerals 


—, 


w 
a 


w 
5 


Information Transmitted (bitsAec) 
a 


2 S F Ea 
Sessions (Blocks of Five Trials) 


Fic. 3. Information transmitted by the average S$ 
in making verbal and motor responses to conven- 
tional and symbolic Arabic numerals as a function 
of long-term practice. Each data point is based 
upon the responses of each of five Ss to five trials 
of 100 stimulus Presentations (or 2,500 responses 
per point). Different groups of Ss were used in 
making the verbal and motor responses. 


that data might be obtained under conditions 
of longer-term Practice. The information- 
handling performances of these Ss are shown 
in Fig. 3. 

Tests of statistical significance applied to 
these data indicated the following: (a) When 
verbal responses were made, the amount of 
information transmitted with the conventional numerals was significantly (P < .05)
greater than that transmitted with the sym- 
bolic on all but two of the 12 sessions (Ses- 
sions 6 and 10); in both sessions, however, 
the direction of the difference also favored 
the conventional numerals. (b) When motor 
responses were made, only the differences ob- 
tained in two sessions (Sessions 5 and 12) 
were statistically significant (P < .05), and 
whereas the direction of one favored one type 
of numeral, the direction of the other favored 
the other type of numeral.

It was found in averaging the verbal-re- 
sponse data of the first and the last six ses- 
sions, that all five Ss transmitted more infor- 
mation with the conventional numerals than 
with the symbolic. Only three of the five
Ss making motor responses transmitted more 
information with the conventional numerals 
during the first six sessions, however, and 
during the last six sessions four transmitted 
more information with the conventional than 
with the symbolic numerals.

The trend with continued practice appears to be, then, that (a) the apparent superiority of the conventional over the symbolic Arabic numerals is retained when verbal responses are made, and (b) even though there is no difference between the numerals in [early] motor-response performance, the same sort of superiority may become apparent after fairly long periods of practice.

Separate summaries of the time and error scores are presented in Table 2 for these data based on longer-term practice. According to these data, the conventional numerals were used with greater speed and greater accuracy than the symbolic when verbal responses were made; this was true for both the first and the last six sessions. Although they were also used with greater speed during both sets of sessions with the motor responses, the conventional numerals were used with less accuracy during the first set of six sessions; the difference was in the same direction, but not statistically significant, during the second set of six sessions.


Table 2


Mean Time and Mean Number of Errors Per 100 Verbal
and Motor Responses to Conventional and
to Symbolic Arabic Numerals
During 12 Sessions


                      Mean Time per            Mean Number of
                      100 Responses (sec.)     Errors per 100 Responses
                      Sessions    Sessions     Sessions    Sessions
                      1-6         7-12         1-6         7-12

Verbal Responses
  Conventional           73.9        63.2        0.307      [0.540?]
  Symbolic               79.6        67.1        0.693      [illegible]
  P of difference     [illegible]

Motor Responses
  Conventional           81.9        70.7        4.420      [4.547?]
  Symbolic               82.9        72.0        3.587      [4.500?]
  P of difference     [illegible]


Note. [...] responses were made in Part V by a different [...]. During each session, each S responded to [...] trials of 100 stimulus [...]


Discussion


On an a priori basis, it seemed reasonable to suppose that information-handling performance with the symbolic numerals might [have] at least equaled that with the conventional numerals. This seemed especially likely in view of (a) the improved likenesses obtained from the eight-element over the six-element matrix (5), (b) the selection of optimal symbols from the eight-element matrix (12), and (c) the evidence apparently favoring straight-line and angular figures over more conventional curved-line figures (3, 7, 15). The data of the present experiment did not corroborate this supposition.

Instead, an interaction between the two types of numerals and the two modes of response, first noted in a previous study (1), was again evidenced here. In terms of information handling (bits/sec.), time, and errors, performance with the conventional numerals was consistently superior to performance with the symbolic numerals when verbal responses were made. No such clear superiority for either set of numerals was evidenced in the motor-response performances.

During the very earliest stages of practice with the motor responses, the conventional numerals appeared to be superior to the symbolic in terms of information-handling performances. This difference did not appear during later stages of practice, but there was a suggestion that the initial superiority might again be found after considerable practice.

The initial superiority of conventional numerals with motor responses appeared to be a function of a superiority in speed, rather than a function of a difference in accuracy. During the later stages of practice with the motor responses, there appeared to be only small differences between the two sets of numerals, and these differences continued in the direction of greater speed, but less accuracy in performance with the conventional numerals than with the symbolic.
One might speculate as to the cause of this interaction and the failure to corroborate the initial supposition of no difference in performance with the two sets of numerals. Two hypotheses are suggested.

[First,] an argument might be offered in terms of stimulus-response compatibility effects. Because number-naming responses to conventional Arabic numerals are greatly overlearned in our culture, it might be argued

that they form a highly compatible stimulus- 
response ensemble—so compatible, in fact, 
that any perceptible change in the figures (as 
in forming the symbolic numerals) results in 
a less-compatible ensemble (as measured by 
lower performance). 

On the other hand, the key-pressing re- 
sponses are less practiced relative to the ver- 
bal responses and form, therefore, a less-com- 
patible ensemble with either set of numerals 
(i.e., performance with motor responses lower 
than with verbal). Also, the differences be- 
tween the less-compatible ensembles formed 
with motor responses should be less affected 
by changes in stimulus figures (i.e., the dif- 
ference between performances with the two 
sets of numerals should be smaller with mo- 
tor responses than with verbal). This argu- 
ment is consistent with the data. 

A second line of argument would suggest 
that performance with straight-line and an- 
gular figures is superior to performance with 
conventional numerals only under difficult or 
threshold-like viewing situations in which 
probability of correct identification, and not 
response time, is measured. This might be 
taken as an explanation of the differences be- 
tween the findings with verbal responses in 
the present study and the earlier findings of 
Tinker (15), Berger (3), and Lansdell (7). 
It would not explain the interaction between
the types of numerals and responses found in 
the present and a previous (1) study. 

Both hypotheses appear reasonable; both are consistent with the data. Additional data are needed, however, before either can be said to be valid.


Summary and Conclusions 


This experiment was designed to compare 
the information-handling performance of Ss
in making verbal and motor responses to two 
sets of Arabic numerals—one a set of con- 
ventional figures, the other a set of symbolic 
figures drawn from an eight-element straight- 
line matrix. The motor (key-pressing) re- 
sponses to the different stimuli were made by 
a group of 24 Ss over a period of two days, 
and by five Ss over a longer period of 12 
days. An identical number of different Ss made verbal (number-naming) responses for the same length periods.

When verbal responses were made, the con- 
ventional numerals were consistently superior 
in performance to the symbolic numerals. 
This was true whether performance was meas- 
ured in terms of information handling (in 
bits/sec.), time, or errors. No such clear su- 
periority was evidenced for either set of nu- 
merals when motor responses were made. 

It was suggested that this interaction of 
numeral type with response mode might be a 
stimulus-response compatibility effect result- 
ing from use of the much-practiced ensemble 
of number-naming responses to conventional 
Arabic numerals. It was also hypothesized, 
considering the data of other investigators, 
that performance with straight-line and an- 
gular figures should be superior to perform- 
ance with conventional numerals under diffi- 
cult or threshold-like viewing situations as, 
for example, in visibility studies, but not 
necessarily superior under speeded-response 
conditions with stimuli above threshold. 

With regard to practical applications, the 
numerals formed by the use of an eight-ele- 
ment “printing” matrix do not appear to be 
quite as satisfactory as standard AND-10400 
numerals. They should not be used if other 
considerations are equal, but should their use 
be dictated by expediency the result should 
be only a small drop in information-handling 
performance. 


Received April 5, 1957. 


References 


1. Alluisi, E. A., & Muller, P. F., Jr. Rate of information transfer with seven symbolic visual codes: motor and verbal responses. USAF WADC Tech. Rep., 1956, No. 56-226.

2. Baker, C. A., & Grether, W. F. Visual presentation of information. USAF WADC Tech. Rep., 1954, No. 54-160.

3. Berger, C. Stroke-width, form and horizontal spacing of numerals as determinants of the threshold of recognition. J. appl. Psychol., 1944, 28, 208-231; 336-346.

4. Brandt, A. E. A test for significance in a unique sample. J. Amer. statist. Ass., 1933, 38, 434-437.

5. Cohen, J., & Webb, Ilse B. An experiment on the coding of numerals for tape presentation. USAF WADC Tech. Rep., 1953, No. 54-86.

6. Garner, W. R., & Hake, H. W. The amount of information in absolute judgments. Psychol. Rev., 1951, 58, 446-459.

7. Lansdell, H. Effects of form on the legibility of numbers. Canad. J. Psychol., 1954, 8, 17-79.

8. Learner, D. B., & Alluisi, E. A. Comparison of four methods of encoding elevation information with complex line-inclination symbols. USAF WADC Tech. Note, 1956, No. 56-485.

9. Morin, R. E., & Grant, D. A. Learning and performance on a key-pressing task as function of the degree of spatial stimulus-response correspondence. J. exp. Psychol., 1955, 49, 39-47.

10. Muller, P. F., Jr. Efficiency of verbal vs. motor responses in handling information encoded by means of colors and light patterns. USAF WADC Tech. Rep., 1955, No. 55-472.

11. Muller, P. F., Jr. Verbalization as a factor in verbal vs. motor responses to visual stimuli. Unpublished doctoral dissertation, Ohio State Univer., 1955.

12. Muller, P. F., Jr., Sidorsky, R. C., Slivinske, A. J., Alluisi, E. A., & Fitts, P. M. The symbolic coding of information on cathode ray tubes and similar displays. USAF WADC Tech. Rep., 1955, No. 55-375.

13. Quastler, H. (Ed.). Information theory in psychology. Glencoe, Ill.: Free Press, 1955.

14. Shannon, C., & Weaver, W. The mathematical theory of communication. Urbana: Univer. of Illinois Press, 1949.

15. Tinker, M. A. Relative legibility of letters and digits. J. gen. Psychol., 1928, 1, 472-496.


Journal of Applied Psychology
Vol. 42


Relationship Between Measured Interest Patterns and
Satisfactory Vocational Adjustment for Air Force
Officers in the Comptroller and Personnel
Fields 1


George W. England and Donald G. Paterson 2


University of Minnesota


1 This study was supported in part by the United States Air Force under Contract Number AF 18(600)-..., monitored by Human Resources Research Institute, Maxwell Air Force Base, Alabama. Permission is granted for reproduction, translation, publication, use, and disposal in whole and in part by or for the United States Government. ... Views or opinions expressed or implied herein are not to be construed as necessarily reflecting the views or indorsement of the Department of the Air Force or of the Air Research and Development Command.

2 Assistance during various phases of the research from ... McCoy, ... G. Jenson, Harry E. Roadman, and Ernest ... is gratefully acknowledged.


The Industrial Relations Center at the University of Minnesota has undertaken a series of studies concerning the measured interests (Strong Vocational Interest Blank) of Air Force officers in the Personnel and Comptroller-Accountant Fields. The initial study in the series showed that neither officer group had interest patterns similar to those of their civilian counterparts (1). This finding suggested the possibility of classifying individual profiles within a given military occupational area as Like or Unlike the average profile of a civilian criterion group in the same occupational area. Subsequently, it could then be determined if the Like and Unlike groups differed on personal history and career data information. Of special significance for the present report are differences between Like and Unlike groups on items relating to satisfactory vocational adjustment.

This study (2) is an attempt to answer the following question: To what degree does the Strong Vocational Interest Blank reflect satisfactory vocational adjustment for officers in the comptroller and personnel fields?


The Sample


Officers who were stationed in the United States as of 1 August 1952 and had Director of Personnel or Comptroller or Accountant-Auditor Staff as their primary Air Force Specialty (AFS) designation, first additional AFS, or duty AFS designation were sent Strong Vocational Interest Blanks and personal history questionnaires. Random samples of 600 officers in the AFS's of Personnel Staff Officer and Personnel Officer also received the material.

Table 1 gives the number of officers in each
AFS to whom the material was sent, the return,
the percentage of return, and the number
of usable returns. The correction factor
includes those officers whose materials were
returned because of incorrect addresses, death
of the officer, etc. The Corrected N Sent
includes just those officers who presumably
received the materials.

Not all returns could be used. Of the 
1,470 officers who returned the material, in- 
formation from 72 could not be used. This 
was 4.9 per cent of the total returned. The 
most frequent reason for nonusable returns 
was not completing correctly the Strong Vo- 
cational Interest Blank or the Personal Data 
Sheet. The next largest group consisted of 
those officers who reported no AFS and no 
duty assignments in the areas under study. 
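
The return figures can be checked with a short sketch. It assumes that each percentage in Table 1 is the number returned divided by the Corrected N Sent, which reproduces the printed values. The Comptroller correction of 7 is inferred from 232 - 225 because the printed cell is illegible, and the variable and function names are illustrative.

```python
# Rows of Table 1: (AFS, N sent, correction for undelivered materials, number returned).
TABLE_1 = [
    ("Comptroller", 232, 7, 187),  # correction inferred from 232 - 225
    ("Acc't-Aud. Staff", 95, 5, 75),
    ("Director of Pers.", 326, 13, 253),
    ("Pers. Staff Officer", 600, 29, 571 - 94),  # 477 returned
    ("Pers. Officer", 600, 43, 478),
]


def return_percentage(n_sent, correction, n_returned):
    """Percentage of return based on officers who presumably received the materials."""
    corrected_n_sent = n_sent - correction
    return round(100.0 * n_returned / corrected_n_sent, 1)


percentages = {afs: return_percentage(sent, corr, ret) for afs, sent, corr, ret in TABLE_1}

total_sent = sum(row[1] for row in TABLE_1)
total_correction = sum(row[2] for row in TABLE_1)
total_returned = sum(row[3] for row in TABLE_1)
total_percentage = return_percentage(total_sent, total_correction, total_returned)

# Returns that could not be used, as a percentage of all returns.
NONUSABLE = 72
nonusable_percentage = round(100.0 * NONUSABLE / total_returned, 1)
usable_returns = total_returned - NONUSABLE
```

Under this assumption the totals agree with the text: 1,470 returns, of which 72 (4.9 per cent) were not usable, leaving 1,398 usable returns.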
The survey returns, then, contained two 
kinds of research data: completed Hankes an- 
swer sheets for the Strong Vocational Inter- 
est Blank and responses to a personal history 
questionnaire. The personal history blank 
requested information about the following: 


1. Age
2. Marital status
3. Number of dependents
4. Highest year completed in school
5. College degrees
6. College major
7. Years of civilian comptroller or personnel experience
8. Date of first entry into military service
9. Total active commissioned service
10. Duty AFSC
11. AF Component (Regular, Reserve, etc.)
12. Military rank
13. Time since last promotion
14. Comptroller or personnel training in the military
15. Comptroller or personnel experience in the military
16. Choice of AF duty
17. Choice of civilian occupation if released from AF duty


Table 1

Summary of Returns

                        N                 Corrected     No.                    Usable
AFS                   Sent   Correction    N Sent     Returned   Percentage   Returns
Comptroller            232        7          225        187        83.1         ...
Acc't-Aud. Staff        95        5           90         75        83.3         ...
Director of Pers.      326       13          313        253        80.8         ...
Pers. Staff Officer    600       29          571        477        83.5         ...
Pers. Officer          600       43          557        478        85.8         ...
Total                1,853       97        1,756      1,470        83.7       1,398


Methods and Procedure 


The SVIB profiles of both the comptroller and personnel groups were separated into Like and Unlike groups on the basis of profile similarity to their civilian counterparts. Strong's criterion groups of accountants and personnel directors provided standards for this separation (3).

By using the mean standard score profile of Strong's criterion group of accountants, upper and lower cutting scores were established on six occupational scales: C.P.A., Senior C.P.A., Accountant, Office Man, Purchasing Agent, and Banker. These six scales were chosen because they are the scales on which Strong's civilian accountants had their highest average scores.

The following procedure was used to obtain cutting scores for determining the comptroller Like and Unlike groups. The mean standard score of the criterion accountants on these six scales was computed; this was a standard score of 43. Any officer whose mean standard score on these six scales was 43 or above was classified in the Like group. The lower cutting score was set at one standard deviation below the mean standard score of the criterion group of accountants on the six scales. This resulted in all the profiles with a mean standard score at or below this lower cutting score on these six scales being classified in the Unlike group. Those cases between the high and low cutting score were considered “Indeterminate.”

Similarly, by using the mean standard score profile of Strong's Personnel Director criterion group, upper and lower cutting scores were established on two occupational scales: Personnel Director and Public Administrator. These two scales were chosen for the personnel group because they are the scales on which Strong's civilian personnel directors scored highest on the average. The Like and Unlike groups of personnel officers were obtained by the same procedure described for the accountant group but with respect to the Personnel Director and Public Administrator Scales. Again, the cases between the high and low cutting scores were considered as “Indeterminate.” Table 2 shows the resulting classification of the 1,398 profiles.


Table 2

Classification of SVIB Profiles According to Similarity
to an Appropriate Criterion Profile

                   Comptroller Group      Personnel Group
Classification       N     Percentage       N     Percentage
Like                 78       32.1         464       40.2
Indeterminate       108       44.4         339       29.3
Unlike               57       23.5         352       30.5
Total               243      100         1,155      100
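
The cutting-score procedure can be sketched as follows. The upper cutting score of 43 for the comptroller group comes from the text; the lower cutting score is passed in as a parameter because its value is not legible in this copy, and the value 33 and the officer's scale scores in the example are purely hypothetical. The function names are illustrative and not taken from the study.

```python
def mean_standard_score(scale_scores):
    """Mean of an officer's standard scores on the selected occupational scales."""
    return sum(scale_scores) / len(scale_scores)


def classify_profile(mean_score, upper_cut, lower_cut):
    """Classify a profile as Like, Unlike, or Indeterminate.

    Like:          mean standard score at or above the upper cutting score.
    Unlike:        mean standard score at or below the lower cutting score.
    Indeterminate: everything between the two cutting scores.
    """
    if mean_score >= upper_cut:
        return "Like"
    if mean_score <= lower_cut:
        return "Unlike"
    return "Indeterminate"


UPPER_CUT = 43               # from the text (criterion accountants' mean)
HYPOTHETICAL_LOWER_CUT = 33  # illustrative only; not a value taken from the study

# Hypothetical officer with standard scores on the six accountant-related scales.
officer_scores = [45, 41, 44, 39, 42, 47]
officer_mean = mean_standard_score(officer_scores)
officer_class = classify_profile(officer_mean, UPPER_CUT, HYPOTHETICAL_LOWER_CUT)
```

The same function would serve the personnel group by averaging the two scales named above and substituting the cutting scores derived from Strong's personnel director criterion group.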


Strong designated two primary uses for his interest inventory, and consequently has used a different criterion for the evaluation of each. First, he proposed that men engaged in occupations have characteristic interest patterns that differentiate them from other occupations. Strong offers a wealth of evidence (3, Chaps. 7-9) to support the validity of this use of SVIB. The second use of the SVIB was to predict the “satisfactory occupational adjustment” of a man. Interest research workers have generally used two factors, “continuance in an occupation” and “expressed vocational preference,” to evaluate this use of the SVIB. The criterion of “continuance in an occupation” has the disadvantage that, because of personal and economic pressures, not all who desire to leave or to enter a vocation are free to do so. “Expressed vocational preference” has the disadvantage of the instability common to most such single statements.

Two items in this study, choice of Air Force duty and choice of a future civilian occupation, would seem to be particularly stable “preference” factors because: (a) the men are mature, as evidenced by a median age of 41.5 years for the comptroller group and ... years for the personnel group; (b) the men are familiar with their occupations, as shown by a median of military comptroller experience of ... years and 3.6 years for the personnel group, in addition to whatever civilian experience they might have had; and (c) the two items combined required the subject to project himself over two different life and work situations, duty in the Air Force and a future civilian job. These factors seem, in contrast to other interest studies discussed by Strong (3, p. 389), to provide a more stable criterion of interest measurement than such a measure as the vocational choice of an adolescent high school or college student who is relatively unfamiliar with the myriad of occupations in the world of work.

The adopted procedure, then, was to determine the relationship between the two preference items (Air Force duty and choice of civilian occupation) and measured interest patterns.


Results and Discussion 

In response to the personal history item, “If you had your choice of Air Force duty, which duty would you choose?”, 64 per cent of the total comptroller sample and 53 per cent of the total personnel sample expressed a preference for duty in their present occupational field. Table 3 shows that comptrollers with interest patterns like successful civilian accountants stated a preference for the comptroller occupational field much more frequently than those with interest patterns unlike successful civilian accountants. This difference is significant at the .0001 level. Within the personnel group the relationship between “choice of Air Force duty” and “Like-Unlike” profile classification was not statistically significant.

In response to the personal history item, “If you were released from the Air Force, what civilian occupation would you like to be engaged in?”, 39 per cent of the total comptroller sample and 36 per cent of the total personnel sample expressed a preference for their present occupation if released from the Air Force. Table 3 shows that the Like groups (both comptroller and personnel) state a preference for their present occupation much more frequently than the Unlike groups. Both differences are significant at the .0001 level.

Table 4 shows the relationship between the two preference items combined and measured interest patterns for both the comptroller and personnel groups.


Table 3

Percentage Relationships and Probability Levels Between Like and Unlike Groups
and Two Vocational Preference Items for Comptroller and Personnel Officers

                                               Total        Like       Unlike                Probability
                                               Sample       Group      Group       Diff.     Level
Comptroller Group                            (N = 243)    (N = 78)    (N = 57)
  Choice of Air Force duty
    (Percentage choosing comptroller duty)      ...          ...        ...         ...        ...
  Choice of Civilian Occupation
    (Percentage choosing comptroller occup.)    39.0         61.5       15.8        45.7     P < .0001
Personnel Group                              (N = 1,155)  (N = 464)   (N = 352)
  Choice of Air Force duty
    (Percentage choosing personnel duty)        ...          ...        ...         ...        ...
  Choice of Civilian Occupation
    (Percentage choosing personnel occup.)      36.0         ...        ...         ...        ...


Table 4

Relationship of Interest Profiles to a Combined Criterion

                                               Total        Like       Unlike                Probability
Description                                    Sample       Group      Group       Diff.     Level
Percentage of Comptroller sample choosing
  Comptroller for both Air Force duty and
  civilian occupation                           33.3         55.1       10.5        44.6     P < .0...
Percentage of Personnel sample choosing
  Personnel for both Air Force duty and
  civilian occupation                           26.8         29.7       11.5        18.2     P < .00...


The results do not attest to the adequacy of the combined criterion. What they do indicate is that if the combined criterion can be presumed to be adequate on the basis of the preceding discussion, there is evidence for the validity of the SVIB in predicting “satisfactory vocational adjustment” for the military occupational population represented by this sample of officers.
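
As a rough check on the Like-Unlike comparison for choice of civilian occupation in the comptroller group (61.5 per cent of 78 against 15.8 per cent of 57), a two-proportion test can be sketched. The report does not say which significance test was used, so the pooled z-test below is an assumption, and the counts of 48 and 9 are inferred from the published percentages and group sizes.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z statistic and two-sided normal probability."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts inferred from the percentages: 48 of 78 is 61.5%, 9 of 57 is 15.8%.
LIKE_CHOOSING, LIKE_N = 48, 78
UNLIKE_CHOOSING, UNLIKE_N = 9, 57

z_statistic, probability = two_proportion_z(LIKE_CHOOSING, LIKE_N, UNLIKE_CHOOSING, UNLIKE_N)
```

Under these assumptions the z statistic is a little above 5 and the two-sided probability falls well below .0001, which is in line with the level reported in the text.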


Summary and Conclusions 


This investigation attempted to discover 
the relationships between measured interest 
patterns (SVIB) and satisfactory vocational 
adjustment for Air Force officers in the Comp- 
troller-Accountant and Personnel Fields. The 
conclusions which seem warranted on the ba- 
sis of the research findings are: 

1. The Strong Vocational Interest Blank reflects the degree of satisfactory vocational adjustment for Air Force officers in the comptroller field. This is shown in three ways. A significantly larger proportion of the group with measured interests similar to those of successful accountants in business and industry state: (a) a preference for Air Force duty in the comptroller specialties; (b) a preference to engage in comptroller occupations in civilian life if released from Air Force duty; and (c) a preference to engage in comptroller occupations for both Air Force duty and civilian occupation, as compared with the group whose measured interest patterns are not similar to successful civilian accountants.

2. The Strong Vocational Interest Blank reflects the degree of satisfactory vocational adjustment for Air Force officers in the personnel field. This is shown in two ways. A significantly larger proportion of the group with measured interests similar to those of successful personnel directors in business and industry state: (a) a preference to engage in personnel occupations in civilian life if released from Air Force duty; and (b) a preference to engage in personnel occupations for both Air Force duty and civilian occupation, as compared to the group whose measured interest patterns are not similar to successful civilian personnel directors.

3. Such evidence justifies the conclusion that measured interests should receive increased emphasis as a factor in military selection and classification procedures for Air Force officer specialists.


Received April 5, 1957. 


References 


1. Paterson, D. G., Jenson, P. G., & England, G. W. The measured vocational interests of officers assigned to positions in the personnel and comptroller fields. Maxwell Air Force Base, Alabama: Air Research and Development Command, Human Resources Research Institute, 1954. (Tech. Res. Rep. No. 21.)

2. Paterson, D. G., & England, G. W. Relationship of measured interests to career data of officers assigned to positions in the personnel and comptroller fields. Lackland Air Force Base, San Antonio, Texas: Air Force Personnel and Training Research Center, 1956. (Res. Rep. AFPTRC-TN-56-44.)

3. Strong, E. K., Jr. Vocational interests of men and women. Stanford: Stanford Univer. Press, 1943.


Journal of Applied Psychology
Vol. 42


The Effect of Specific Selection Sets on a Forced-Choice
Self-Description Inventory 1


Robert E. Krug 2


Carnegie Institute of Technology


1 ... were presented at the 1957 meeting of the ... Psychological Association.

2 The author wishes to thank Sarah Ann Beltz for performing most of the computations.


A major objective of the forced-choice scale is the control of transparency and hence of biasability. The vehicle of control is the equivalence of general desirability which obtains for items within a set (pair, triad, etc.). Since alternatives are equally favorable, S is presumably deprived of the opportunity to describe himself in a consistently favorable manner. Several studies (3, 6, 8) have shown that this is a tenable assumption; Ss do not improve their scores on forced-choice scales when instructed to describe themselves in the most favorable light. However, Gordon (4) finds that gains are made on two scales of the Gordon Personal Profile when scores from guidance and employment conditions are compared, while scores on the two remaining keys decrease. This suggests that S's motivation in an employment situation may operate more specifically than a tendency to describe himself favorably. A given situation may suggest that certain qualities are relevant, some of which may relate to keys of the inventory. Two studies on the Jurgensen Classification Inventory (8, 9) show that when Ss are asked to describe themselves as self confident, scores on a self confidence key increase significantly. While the mean score on this key does not increase when Ss are asked to assume that they are taking the test as part of a selection procedure, the correlation between guidance and selection sets is but .50, indicating that scores change, some upward, some downward (8). For the purposes of selection, any change from the true score is undesirable.

The present study investigates the effect of introducing knowledge of an employer's objectives into an “assumed selection” situation, where this knowledge is relevant to some scale of the inventory employed. Like other studies which utilize “assumed selection” sets, the study may be criticized as lacking realism and consequently as lacking relevance. The evidence for a defense consists of the specific instructions given and the reports of Ss concerning their behavior.


Procedure 


The Ss were 46 junior men in a college of engineering. Their participation in a two-hour testing session partially fulfilled a requirement of the introductory psychology course. Each S completed the Ghiselli Self Description Inventory under each of seven conditions. This scale, which is described elsewhere (1), consists of 64 pairs of adjectives, half of the pairs presenting two favorable terms, the other half presenting two unfavorable terms. Items receive weighted scores on five empirically derived keys. The intelligence key is composed of 36 items, with a maximum score of 70; initiative, 17 items and 51 points; self-assurance, 31 items and 48 points; supervisory qualities, 24 items and 54 points; occupational level, 20 items and 65 points. About half of the items of the inventory are scored on more than one key. The following instructions were given:

“You will be asked to complete this inventory several times. Later in the period, I will answer questions about the procedures, but for the moment I will ask you simply to listen to the instructions and follow them as closely as possible.”

Set I. “Read the instructions at the top of the page. I would like you to describe yourself as accurately as possible using the pairs of adjectives in the manner indicated.” Upon completion of each set, answer sheets were collected.

Set II. “I would like you to assume that you are applying for a job in which you have some interest. As one part of the selection procedures, you are asked to complete this inventory. Assume that other test scores will be considered, along with your college transcript, interview report, letters of recommendation, etc. The organization to which you are applying advertises that it is looking for young men with initiative. I realize that you are not in the situation described; I am asking you to imagine that you are, and to act as you believe you would.” (The question was asked, “Do you mean that we should cheat?” The answer given was, “I do not. I have no way of knowing what you would do in this situation. I am asking you to decide what you would do, and then to do it. Remember, this inventory is but one part of the selection apparatus.”)

Set III. Same as II, with the substitution of “the organization is looking for intelligent people,” for “the organization advertises that it is looking for young men with initiative.”

Set IV. Same as II and III, with the substitution of “the organization is looking for men with self-assurance.” Upon completion of Set IV, Ss were asked about their behavior on the three preceding administrations. This will be referred to later.

Set V. “Considering all the people with whom you are well acquainted, select one person who, in your opinion, possesses the most initiative. Spend some time thinking about this, and decide on the person you would rank first among all acquaintances on this trait. I do not want a hypothetical possessor of a trait; I want a real person. Having selected the person, forget about the trait, and describe this person as accurately as you can.”

Set VI. Same as V, but “most intelligent person” described.

Set VII. Same as V and VI, but “most self-assured person” described.


Analysis and Results 


All answer sheets were scored on each of 
three “relevant” keys (initiative, intelligence, 
self-assurance) and on an “irrelevant” key 
(occupational level). The resulting score 
matrices (46 Ss by seven sets) provided the 
basic data for analysis. Mean scores and 
standard deviations are presented in Table 1. 
Bartlett’s test indicated that variances were 
homogeneous, permitting an analysis of vari- 
ance. The analysis of variance is summa- 
rized in Table 2. All F’s in this table are 
significant at the .01 level. 
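
The F ratios in Table 2 can be reproduced from the mean squares. The sketch below assumes the usual subjects-by-sets layout, in which each effect's mean square is divided by the residual mean square; this assumption matches the printed values to two decimals. The dictionary layout and function names are illustrative.

```python
N_SUBJECTS = 46
N_SETS = 7

# Degrees of freedom for a subjects-by-sets analysis with one score per cell.
DF_SETS = N_SETS - 1
DF_SUBJECTS = N_SUBJECTS - 1
DF_RESIDUAL = DF_SETS * DF_SUBJECTS

# Mean squares as printed in Table 2.
MEAN_SQUARES = {
    "Initiative": {"sets": 449.45, "subjects": 74.84, "residual": 36.79},
    "Intelligence": {"sets": 173.40, "subjects": 125.46, "residual": 31.96},
    "Self-Assurance": {"sets": 256.15, "subjects": 85.04, "residual": 20.37},
    "Occupational Level": {"sets": 340.20, "subjects": 185.83, "residual": 43.85},
}


def f_ratios(mean_squares):
    """Return (F for sets, F for subjects), each tested against the residual."""
    residual = mean_squares["residual"]
    return (
        round(mean_squares["sets"] / residual, 2),
        round(mean_squares["subjects"] / residual, 2),
    )


F_TABLE = {key: f_ratios(ms) for key, ms in MEAN_SQUARES.items()}
```

The degrees of freedom follow directly from 46 Ss and seven sets: 6 for sets, 45 for subjects, and 270 for the residual.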


Since for each key the lowest score is ob- 
tained from “accurate self description,” sig- 
nificant increases are associated with some of 
the induced sets. Table 3 presents the values 
with which differences between means may 
be evaluated. Inspection of Table 1 in this 
light indicates that both classes of set (as- 
sumed selection and description of person 
possessing trait) produce such a shift. A set 
involving either intelligence or initiative results
in significant increases on all four keys.


Discussion 


Mean scores on Sets V, VI, and VII are essentially indirect indications of validity. For example, individuals seen by others as intelligent (Set VI) are described by adjectives which are weighted on the intelligence key. As a group, these individuals also score high in initiative, self-assurance, and occupational level. However, the correlation between individuals on the initiative and intelligence keys for Set VI is only .236, which agrees with the coefficient reported by Ghiselli (3, p. 17) of .227. One might agree that the underlying traits are probably related to this extent. It is at least possible that overlap between keys is of no great concern, given a set to produce valid descriptions. However, we should note that “most intelligent” (Set VI) and “most initiative” (Set V) Ss score higher on self-assurance than do “most self-assured” Ss (Set VII). This is clearly not desirable if the inventory is to have diagnostic significance and suggests the reduction of overlap between the self-assurance key and other keys.


Table 1

Means and Standard Deviations a

                                                  Set
Key                          I       II      III     IV      V       VI      VII
Initiative            M    29.02   37.78   34.65   29.22   34.29   33.70   32.00
                      SD    ...     6.3     7.4     5.4     6.6     5.8     5.8
Intelligence          M     ...    42.83   45.48   40.22   42.50   43.89   43.02
                      SD    ...     6.1     8.1     5.9     7.2     ...     5.9
Self-Assurance        M     ...    29.26   30.00   25.93   27.63    ...    26.52
                      SD    ...     ...     ...     ...     ...     ...     4.9
Occupational Level    M    33.89   43.09   39.67   38.20    ...     ...     ...
                      SD    ...     ...     ...     ...     ...     7.3     8.2

a The sets relevant for each key (italicized in the original) are: initiative, II and V; intelligence, III and VI; self-assurance, IV and VII.


Table 2

Summary of the Four Analyses of Variance

                     Initiative        Intelligence      Self-Assurance    Occupational Level
                     Mean              Mean              Mean              Mean
Source       df      Square     F      Square     F      Square     F      Square     F
Sets           6     449.45   12.22    173.40    5.43    256.15   12.57    340.20    7.76
Subjects      45      74.84    2.03    125.46    3.93     85.04    4.17    185.83    4.24
Residual     270      36.79             31.96             20.37             43.85

Note. P for all F's < .01.

Of more concern in this investigation are the effects associated with the assumed selection sets (II, III, IV). These may be stated briefly. First, when a set is established via a statement about employer objectives (i.e., “the organization advertises that it seeks men with initiative”), scores increase significantly on a key related to the stated objective. This is the usual demonstration of transparency; if one knows what the test attempts to assess, he can influence his score. There is no evidence relevant to transparency in the sense of being able to infer what the test measures. Ghiselli's data (3) suggest that this “seeing through” the test does not occur, and the present study in no way challenges this.

Second, for Sets II and III, the bias introduced generalizes to other keys. The pattern here is identical with that found for the “valid description” sets, but in this case the problem would appear more serious. Regardless of whether the generality is produced by real correlation of the underlying traits, by the mechanics of inventory and key construction, or by the varying success which individuals meet in attempting to beat the test, the opportunity to score high on some key “by accident” is a weakness in any situation ... is suspect. Heron's study (5) ... similar conclusions. ... “bright young men” or “men with initiative”; individuals may ... such information, an applicant may obtain spurious scores on some key or keys. Again, as in Sets V, VI, and VII, the generality of score increases is not due entirely to high correlations between keys. Intelligence and initiative correlate .20, .43, and .42 for Sets II, III, and IV.

Third, the similarity of results for the two classes of set employed suggests the presence of a general factor on which items and scales have varying loadings. The self-assurance scale presumably has less of this general factor than have the initiative and intelligence scales. This similar effect of the two classes of set could, of course, be produced by the Ss employing similar operations in the two cases, but there is evidence that this is not the case. The correlations between relevant sets for initiative, intelligence, and self-assurance are -.17, .12, and .08. In addition, the discussion following Set IV suggests that the typical procedure for the selection sets was to choose the alternative which appeared relevant to the trait whenever one member of a pair was “obviously related” or when neither member of a pair was seen as good self-description by an S. No S admitted describing some other real person in these sets.


Table 3

Critical Values for Differences Between Means

                        Self-       Occupational
Initiative  Intelligence  Assurance   Level
...     2.48        2.32          1.85        2.72
...     ...         ...           ...         3.50


Considered together, the results suggest that the usual preference index is an insufficient basis for building item pairs. Within the forced-choice format, two alternatives suggest themselves. One would be to equate items on both preference index and general factor loading, so that a choice could be treated in terms of the item's specific variance alone.3 A similar purpose might be achieved by introducing a selection set into the establishment of the preference index, i.e., by obtaining judgments to a question like “how favorable would this term be as a description of a job applicant?” As several writers (6, 7, 11) have indicated, many varieties of preference index are possible. An adequate index is one which controls for sources of error known to exist. Whatever the value of the alternatives advanced, it is at least conceivable that the selection situation may produce special requirements which can be met by some type of preference index. It is also possible, of course, that adequate control will be found outside the forced-choice approach.

Summary and Conclusions 


1. Indirect evidence of validity is presented for three scales of the Ghiselli Self Description Inventory. Persons viewed by others as possessing a trait in marked degree receive high scores on the scale designed to measure that trait.

2. When a set is introduced which suggests that a company is “looking for men with _____,” scores on the trait named increase significantly. In this sense, the inventory is transparent.

3. Bias introduced by a specific set generalizes to other scales in the inventory. This is a disturbing influence in use as a selection instrument, since it increases the number of potential sources of a high score. This generality is attributed to the presence of a general factor which is not controlled in the pairing of items, and in part to the overlap produced by keying some items on more than one scale.

4. It is suggested that preference index alone is an insufficient basis for constructing forced-choice pairs, if biasability is to be minimized. Suggested alternatives which retain the forced-choice approach include matching on general factor loading, or making the preference index specific to the selection situation.


3 In a personal communication, Robert J. Wherry suggests that items be matched on preference index, general factor loading, and discrimination while varying on group factor loading. He currently employs this procedure, thereby constructing scales which are purely diagnostic, ignoring level.


Received April 22, 1957.


References 


1. Ghiselli, E. E. The forced-choice technique in
self description. Personnel Psychol., 1954, 7,
201-208.

2. Ghiselli, E. E. A scale for the measurement of
initiative. Personnel Psychol., 1955, 8, 157-164.

3. Ghiselli, E. E. Manual for the self description
inventory. Unpublished manuscript, University
of California.

4. Gordon, L. V., & Stapleton, E. S. Fakability of
a forced-choice personality test under realistic
high school employment conditions. J. appl.
Psychol., 1956, 40, 258-262.

5. Heron, A. The effects of real-life motivation on
questionnaire response. J. appl. Psychol.,
1956, 40, 65-68.

6. Highland, R. W., & Berkshire, J. R. A methodological
study of forced-choice performance
rating. USAF Hum. Resour. Res. Cent. Res.
Bull., No. 51-9, 1951.

7. Lanman, R. W., & Remmers, H. H. The “preference”
and “discrimination” indices in forced-choice
scales. Educ. psychol. Measmt, 1954,
14, 541-551.

8. Longstaff, H. P., & Jurgensen, C. E. Fakability
of the Jurgensen classification inventory. J.
appl. Psychol., 1953, 37, 86-89.

9. Mais, R. D. Fakability of the classification inventory
scored for self confidence. J. appl.
Psychol., 1951, 35, 172-174.

10. Rusmore, J. T. Fakability of the Gordon Personal
Profile. J. appl. Psychol., 1956, 40,
175-177.

11. Wherry, R. J. Factor analysis of rating item
indices. AGO, PRS Technical Res. Rep., No.
918, 1951.


Journal of Applied Psychology
Vol. 42, No. 2, 1958


Limitations on the Use of Strong Sales Keys for Selection 
and Counseling 


J. L. Hughes and W. J. McNamara 


International Business Machines Corporation 


The available Strong sales keys are frequently
used in selection and counseling for
many different types of sales positions. Their
more or less universal validity appears to be
generally accepted, despite cautions about the
need for validating them in any given situation
(9). Review of the literature, however,
furnishes little support for the belief in the
general validity of Strong’s sales keys (4, 5).
Aside from studies of casualty and life insurance
salesmen (1, 2, 3, 7, 8) and a study of
detergent salesmen (6), there is no evidence
that the Strong sales keys are valid
predictors of success in other types of sales
positions. Further, the high intercorrelations
(.82 to .84) among the sales keys (Life
Insurance Salesman, Real Estate Salesman,
Sales Manager) indicate that they are essentially
measuring highly similar sales interest
patterns (8). In view of the recognized differences
among various sales positions, the
evidence on validity suggests the possibility
that the existing Strong sales keys
might not be suitable for use in many sales
selection and counseling situations.

The present study provided an opportunity
to investigate this possibility. Earlier unreported
studies had developed two custom-built
Strong sales keys by an item analysis
of two different types of salesmen in the same
company. Cross-validation of these keys indicated
that they were effective in selection.
The relationships of these two valid sales
keys to each other and to three Strong sales
keys (Life Insurance Salesman, Sales Manager,
and Group IX) were then determined.
The results indicated the similarities and differences
in type of sales interest patterns
measured by these different keys.


Description of Study 
The two types of salesmen in this study were engaged
in the sale of accounting and data processing
machines on a rental basis (DP) and of
electric typewriters (ET) for the same company.
Since objective criteria of sales success were found
to be unreliable (the r between first and second year
production based on percentage of quota was -.11
for 89 DP salesmen), the criterion used to measure
sales success was survival. All men who completed
a minimum of 18 months after being assigned a sales
territory were considered successful. All who terminated
before this time because of poor performance
were considered unsuccessful. The criterion was the
percentage of men who terminated for each Strong
score level.

The present study was primarily a follow-up on 
the validity of two custom-built sales interest keys 
for new samples of salesmen. Keys for DP and ET 
salesmen had been constructed previously by an 
item analysis of each group. The DP key contained 
199 item responses (128 items), the ET key 193 item 
responses (129 items). These were items with per- 
centage differences between successful and unsuccess- 
ful salesmen significant at the .10 level or better. 
After preliminary cross-validation of these keys, 
letter grades ranging from A to D were established 
and the keys were used by the company in selecting 
DP and ET salesmen. The DP and ET cross-valida- 
tion samples used in this study were men who had 
taken the Strong test before employment and had 
been selected partly on the basis of their Strong 
scores. Since the selection procedure discouraged 
the employment of low scoring applicants, the range 
of Strong scores for these samples was considerably 
restricted. 

The DP cross-validation sample totaled 358
men; the ET sample, 220. The validity of the
DP and ET keys was determined for these groups.
Random samples of 140 DP and 100 ET salesmen
were then selected from these groups and the remainder
of the study was carried out on the smaller
samples. This consisted of scoring them on three
Strong sales keys (Life Insurance Salesman, Sales
Manager, and Group IX) and correlating these three
keys with the two custom-built keys (DP and ET).


Results 


Table 1 shows the relationship between the
letter grades for the custom-built DP key and
terminations from sales for the total cross-validation
sample of 358 DP salesmen. The
percentage separated increased from 7% for
A’s to 31% for D’s. The differences in percentage
were significant at the .01 level by
the chi-square test.


Table 1

Relationship Between Strong DP Sales Key and
Terminations Due to Poor Performance
for 358 DP Salesmen

                        Company Terminations
Letter       Total
Grade        Sample         No.        %
A             104             7         7
B             118            10         8
C              97            13        13
D              39            12        31
Total         358            42


Table 2 indicates the validity of the custom-built
ET key for the total cross-validation
sample of 220 ET salesmen. The percentage
terminated increased from 16% for
A’s to 52% for D’s. These differences in
percentage were significant at .02 by chi-square
test.
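
The chi-square tests mentioned for Tables 1 and 2 can be sketched from the published counts. The authors do not report the statistic or the exact procedure they followed, so the function below (a hypothetical helper, not the authors' computation) simply applies the ordinary Pearson chi-square to a 2 x 4 table of terminated versus retained men; the critical values are the standard tabled values for 3 degrees of freedom.

```python
def chi_square_2xk(terminated, totals):
    """Pearson chi-square for a 2 x k table (terminated vs. retained)."""
    n = sum(totals)
    p_term = sum(terminated) / n  # pooled termination rate across grades
    stat = 0.0
    for t, total in zip(terminated, totals):
        expected_term = total * p_term
        expected_kept = total * (1 - p_term)
        stat += (t - expected_term) ** 2 / expected_term
        stat += ((total - t) - expected_kept) ** 2 / expected_kept
    return stat


# Counts for letter grades A, B, C, D as printed in Tables 1 and 2.
dp_chi2 = chi_square_2xk([7, 10, 13, 12], [104, 118, 97, 39])
et_chi2 = chi_square_2xk([9, 20, 18, 17], [55, 74, 58, 33])

# Tabled chi-square critical values for df = (2 - 1) * (4 - 1) = 3.
CRITICAL_02 = 9.837
CRITICAL_01 = 11.345

print(f"DP key: chi-square = {dp_chi2:.2f}, beyond .01 value: {dp_chi2 > CRITICAL_01}")
print(f"ET key: chi-square = {et_chi2:.2f}, beyond .02 value: {et_chi2 > CRITICAL_02}")
```

Under these assumptions both tables exceed the critical value for the significance level the text reports; the exact statistics may differ from whatever the authors obtained.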

The restrictive effect of the selection standards
on the range of sales interest for 140 DP
and 100 ET salesmen is shown in Table 3.
On Strong’s three sales keys, few of the DP
or ET salesmen employed scored below B+,
the minimum letter grade generally considered
indicative of adequate interest in an occupation
(8). This restriction of range explained
why Strong’s sales keys had previously
been found not to be useful for selecting
DP and ET salesmen.


Table 2

Relationship Between Strong ET Sales Key and
Terminations Due to Poor Performance
for 220 ET Salesmen

                        Company Terminations
Letter       Total
Grade        Sample         No.        %
A              55             9        16
B              74            20        27
C              58            18        31
D              33            17        52
Total         220            64


Table 3

Distribution of Letter Grades for Life Insurance Salesman,
Sales Manager, and Group IX (Sales)
Strong Keys for 140 DP and 100 ET Salesmen

Strong       Life Insurance      Sales            Group IX
Letter       Salesman            Manager          (Sales)
Grade        DP      ET          DP      ET       DP      ET
A            114     89          126     88       135     100
B+            16      8           10     12         1       -
B              4      2            ?      -         ?       -
B-             1      1            ?      -         ?       -
C              5      -            ?      -         ?       -
Total        140    100          140    100       140     100


Table 4 gives the intercorrelations among
the custom-built DP and ET keys and the
Life Insurance Salesman, Sales Manager, and
Group IX (Sales) keys. The DP and ET
keys were not significantly related. The correlations
of the DP and ET keys with the
three standard Strong keys, however, were all
significantly different from zero at the .01
level. The DP key was positively related to
Strong’s sales keys, while the ET was negatively
related to them.


Discussion 


The results indicated the validity of the
custom-built sales keys for DP and ET sales
(Tables 1 and 2). In view of the restriction
of range in these samples caused by using
these keys in selection, the true validity is
undoubtedly higher than shown here. Since
earlier efforts to validate the standard Strong
sales keys for these same positions in the
company had failed, this study furnished additional
evidence of the advantage of constructing
custom-built keys for given sales
positions when possible (9). In addition,
other data not reported here showed that the
DP key was not valid for ET sales, or the
ET key for DP sales. Thus, even in the
same company, the two sales positions were
sufficiently different to require separate sales
interest keys.


Table 4

Product-Moment Correlations Between Company and
Published Strong Keys for 140 DP and
100 ET Salesmen

                              140 DP          100 ET
                              Salesmen a      Salesmen b
Key                           DP Key vs.      ET Key vs.
ET                              -.13
DP                                                ?
Life Insurance Salesman          .52            -.48
Sales Manager                    .52            -.34
Group IX (Sales)                 .51              ?

a For 138 df, an r of .17 is significantly different from zero
by t test at the .05 level; .23 at the .01 level.
b For 98 df, an r of .20 is significantly different from zero by
t test at the .05 level; .25 at the .01 level.


The dissimilarity of the DP and ET keys
was further demonstrated by the lack of a significant
relationship between them (Table 4).
This finding appeared reasonable because of
differences between DP and ET sales. DP
salesmen function mainly as accounting system
consultants who analyze customers’ accounting
problems and prepare technical proposals
on methods of improving their operations
by use of accounting machines. After
getting the order, they continue to provide
service and technical advice to their customers
during the rental period. DP salesmen
often spend months in closing a contract and
their average orders are frequently quite sizable.
The ET salesmen, on the other hand,
operate differently. Their product is less
technical in nature, and their sales presentation
is based on a demonstration of the operating
features of the electric typewriter.
Their sales are generally made in less time
and fewer calls, and the size of their orders
is usually smaller than those in DP. As a
result, they spend more time in cold canvassing
for new sales prospects than DP men.
Another difference is the length of the training
period. For DP, it covers approximately
a year and a half, while it is only a few
months for ET. There are thus a number of
important differences between DP and ET
selling which could have caused the sales
interest patterns needed for success in each
to differ. These differences would explain
why earlier company studies have found that
the DP key was not valid for ET sales and
the ET key not valid for DP sales.

The intercorrelations of the DP and ET
keys with Strong’s sales keys raised some
interesting points. For DP salesmen, the correlations
of the valid DP key with Strong’s
keys Group IX, Sales Manager, and Life Insurance
Salesman were .51, .52, and .52, respectively
(Table 4). The size of these correlations
showed that the DP key was measuring
sales interest patterns generally similar
to those measured by Strong’s sales keys.
DP salesmen thus appeared to be similar to
Strong’s key standardization groups of life
insurance salesmen and sales managers. Despite
this, however, Strong’s sales keys were
not valid for DP sales, presumably because of
the restricted range of the high sales interest
scores of the DP salesmen (Table 3).

The custom-built ET key, on the other
hand, was negatively related to Strong’s keys
(Table 4). The correlations ranged from
-.34 for Sales Manager to -.48 for Life
Insurance Salesman, and indicated that the
ET sales interest pattern was quite different
from the sales interest patterns measured by
Strong’s sales keys. ET salesmen thus appeared
to be a different type of salesman from
the life insurance salesmen and sales managers
used by Strong to construct his sales
keys.

These findings indicated the danger of using
Strong’s sales keys in selecting applicants for
a given sales job without prior validation.
Scoring ET sales applicants on Strong’s sales
keys would have classified them in a ranking
negatively related to their ranking on the
valid ET key. For example, successful ET
salesmen would have tended to score low on
the Life Insurance Salesman key and terminators
high.

A possible explanation of the negative re- 
lationships between the ET key and Strong’s 
sales keys may be the different sales ap- 
proaches used by ET salesmen and Strong’s 
standardization groups because of the nature 
of the product sold. The ET salesman usu- 
ally carries an electric typewriter with him on 
prospect calls, and tries to give a demonstra- 
tion of its operating features to his prospects. 
Thus, the ET sales presentation differs considerably
from that used in selling life insurance,
for example, where the sales arguments
have to be made without reference to a tangible
product.

Since many other products are sold by
means of a similar tangible sales presentation,
the findings in this study raise doubts
about the suitability of Strong’s sales keys 
for selecting applicants for this type of sales 
position. The validity of the standard Strong 
sales keys for different sales positions may 
thus be more limited than generally believed. 
This may partly explain why the literature 
contains little evidence of the validity of the 
Strong sales keys for other than casualty and 
life insurance sales. 

The results with the ET key further sug- 
gested that more caution might also be needed 
in using the Strong sales keys for counseling 
purposes. If the sales patterns measured by 
these keys are not suitable for all sales posi- 
tions, the use of these keys might present a 
misleading picture of the suitability of a 
counselee’s interest pattern for a number of 
sales jobs. In view of the widespread use of 
Strong’s sales keys in counseling, further investigation
of this area is needed.


Summary 


Two custom-built Strong sales keys were 
validated on two different types of salesmen 
(N = 578) in the same company. Each key 
was valid only for the type of salesman used 
in the item analysis to construct the key. The 
absence of significant correlation between the 
two keys indicated the independence of the 
two sales interest patterns related to success 
in different sales positions in one company. 

One of the company keys was significantly
positively related to three Strong sales keys
(Life Insurance Salesman, Sales Manager,
and Group IX), although the latter were not
valid for salesmen in the company studied.
The other key was significantly negatively
related to Strong’s sales keys. Use of the
Strong sales keys in selection for the latter
sales position therefore would have resulted
in serious misclassification of applicants.
These findings suggested that valid sales
interest patterns can be quite specific, and
that the available Strong sales keys might
give misleading results if used for selection
and counseling purposes in some sales areas.


Received April 29, 1957. 


References 


1. Bills, M. A. Relation of scores in Strong’s interest
analysis blanks to success in selling casualty
insurance. J. appl. Psychol., 1938, 22,
97-104.
2. Bills, M. A. Selection of casualty and life insurance
agents. J. appl. Psychol., 1941, 25, 6-10.
3. Bills, M. A. A tool that has stood the test of
time. In L. L. Thurstone (Ed.), Applications
of psychology. New York: Harper, 1952.
Pp. 131-137.
4. Cleveland, E. A. Sales personnel research, 1935-
1945: a review. Personnel Psychol., 1948, 1,
211-230.
5. Husband, R. W. Techniques of salesmen selection.
Educ. psychol. Measmt, 1949, 9, 1[?]-148.
6. Otis, J. L. Procedures for the selection of salesmen
for a detergent company. J. appl. Psychol.,
1941, 25, 30-40.
7. Schultz, R. S. Test selected salesmen are successful.
Personnel J., 1935, 14, 139-142.
8. Strong, E. K. Vocational interests of men and
women. Stanford: Stanford Univer. Press,
1943.
9. Super, D. Appraising vocational fitness. New
York: Harper, 1949.




The Superiority of Gloved Operation of Small Control Knobs ¹


William Leroy Jenkins 


Lehigh University 


Wearing a glove on the operating hand 
might be expected to affect adversely the 
smallest amount of rotary movement a sub- 
ject can make on a tactual-kinesthetic basis 
(mean least turn) and also the time required 
to make discrete settings on a linear scale. 
Data pertinent to these questions have been 
extracted from a study concerned with the 
influence of a variety of factors on mean least 
turn and on the time required to make set- 
tings on a linear scale (2). 


Apparatus 


The apparatus for measuring mean least turn consists
of a control shaft mounted on ball bearings
and provided with a mirror reflecting a beam of
light onto a scale, thus giving a magnified measure
of any rotation of the shaft.

The linear scale apparatus has been previously described
(3). It permits measurement of the time
required to move a pointer by means of a control
knob from the center of a linear scale to a preselected
lighted insert. In the present study, inserts
3/16 in. right and left of center and inserts 4 in. right
and left of center were used. The error tolerance of
.007 in. was determined by the width of the pointer
(.118 in.) in relation to the width of the inserts
(.125 in.). The control ratio was such that the
pointer moved 1.18 in. for each complete turn of the
control knob.


Procedure 
Subjects were Lehigh University students who
were paid at the current rate. In least turn measurements,
S averted his head so that he could not
[...] as little as [...]. The turn was recorded to the
[...]. In making settings on the linear scale,
S was given a ready signal, and a preselected
insert was lighted. The S turned the control knob
until the pointer was within the limits of the insert
and released the clutch. Time was measured
from the instant of lighting-up of the insert to the
instant of clutch-release to the nearest hundredth of a
second.


¹ This report is based on data taken from a larger
study [...] AF33(616)-285[...] between the [...]
Air Research and [...] Wright-Patterson Air Force Base.



Knob position, knob orientation, and knob diam- 
eter entered as independent variables in addition to 
the barehand-gloved factor. Each S made 40 least 
turns and 40 linear scale settings under each set of 
conditions and 16 to 20 Ss were used in each part. 

For gloved operation, S wore an MA-1 double fly- 
ing glove consisting of an inner woolen glove and an 
outer limp leather shell. Some least turn measurements
were also taken with S wearing a stiff rubber-covered
glove such as is used in handling corrosive chemicals.


Results


Figure 1 shows the percentage differences 
of gloved compared to barehand operation in 
relation to knob diameter. Each circle rep- 
resents the difference for a group of 16 to 20 
Ss under a particular set of conditions (knob 
positions or orientations). Solid circles indi- 
cate statistically significant differences. The 
small flag on two of the circles means that 
the stiff rubber-covered glove was used. All 
others involved the MA-1 double flying glove. 

With small knob diameters, gloved operation
is consistently superior. This is true not
only for mean least turn but also for time to
make settings at both distances. With the
larger knob diameters, the consistent superiority
of gloved operation disappears. In fact,
in mean least turn, there is a hint that barehand
operation may be significantly superior
under some conditions.


[Figure 1: three panels (Least Turn; Linear Scale Setting, Total
Time, 4 in. Distance; Linear Scale Setting, Total Time, 3/16 in.
Distance) plotting per cent difference of gloved from barehand
against knob diameter.]

Fig. 1. Mean least turn and time to make settings
as a function of barehand vs. gloved operation.


Discussion 


As a part of a broader study of the effect 
of gloves on control operation time, Bradley 
(1) used a 14-in. knob in four horizontal po- 
sitions. His procedure required S to turn the 
knob 40° within a tolerance of 2° so that a 
light went out and remained out. This is 
roughly analogous to making a setting on the 
linear scale apparatus at 3/16 in. distance (57°
knob turn) to a tolerance of .007 in. (2° 
knob turn). In two positions mean time with 
MA-1 double flying glove was slightly shorter, 
in the other two positions slightly longer, than 
in barehand operation. The over-all differ- 
ence was not statistically significant. The 
1}-in. diameter in our study showed a sta- 
tistically significant difference in favor of 
gloved operation. 
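
The knob-turn equivalents quoted in this comparison follow from the control ratio given under Apparatus (1.18 in. of pointer travel per complete turn). The short sketch below reproduces that arithmetic; the function name is hypothetical, and the shorter setting distance is taken as 3/16 in., which is the value the 57° figure implies.

```python
POINTER_TRAVEL_PER_TURN_IN = 1.18  # inches of pointer travel per knob turn (Apparatus)


def pointer_inches_to_knob_degrees(distance_in):
    """Convert pointer travel on the linear scale to degrees of knob rotation."""
    return distance_in / POINTER_TRAVEL_PER_TURN_IN * 360.0


short_setting_deg = pointer_inches_to_knob_degrees(3 / 16)   # short setting distance
tolerance_deg = pointer_inches_to_knob_degrees(0.007)        # error tolerance
long_setting_turns = 4 / POINTER_TRAVEL_PER_TURN_IN          # 4 in. setting, in turns

print(f"3/16 in. setting: {short_setting_deg:.1f} degrees of knob turn")
print(f".007 in. tolerance: {tolerance_deg:.1f} degrees of knob turn")
print(f"4 in. setting: {long_setting_turns:.2f} complete turns")
```

The results round to the 57° and 2° figures cited in the text, and the 4 in. setting works out to somewhat more than three complete turns.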

The apparent paradox of better operation 
with gloves is not readily resolved. The 
smaller mean least turn taken alone might 
be explained in terms of slippage of the hand 
inside the glove; i.e., if the glove surface
moves less than the hand inside it, a smaller
measured mean least turn would result for
the same amount of actual turn. But this
facile explanation fails utterly to show why
times for making settings on a linear scale
are shorter with gloved operation, shorter
not only with settings at 3/16 in. distance but
also with settings at 4 in. distance requiring
over three complete turns of the knob before 
the final adjustment can be made. No rea- 
sonable explanation of the shorter linear scale 
setting times seems to be forthcoming to date. 


Summary 


The least amount of turn on a tactual- 
kinesthetic basis and the time to make set- 
tings on a linear scale were studied in bare- 
hand operation and with MA-1 double flying 
glove. With small knobs, gloved operation 
was superior in both. With larger knobs, the 
superiority was lost. No ready explanation 
of the phenomena has been developed. 


Received May 3, 1957. 


References 


1. Bradley, J. V. Effect of gloves on control operation
time. WADC TR 56-532.

2. Jenkins, W. L. Mean least turn and its relation
to making settings on a linear scale. WADC
TR 57-210. (ASTIA No. AD 118174), May
1957.

3. Jenkins, W. L., & Connor, M. B. Some design
factors in making settings on a linear scale.
J. appl. Psychol., 1949, 33, 395-409.




Some Additional Data on the Relationships Between Expressed 
and Measured Values 


James B. Nickels 


University of Missouri 


and Guy A. Renzaglia 


Southern Illinois University * 


Several investigators have studied the relationships
between expressed and measured
values through the use of self-value ratings
and the original Study of Values (1). Their
purposes for conducting such research varied.
Pintner (6) and Anderson (3) attempted to
examine more closely the apparent relationships
or lack of relationships between professed
(subjective) values and inventoried
(objective) values. Stanley (7) had a similar
purpose but improved on previous methods
of obtaining and analyzing the raw data.
Vernon and Allport (8) used the relationships
between expressed and measured values as a
somewhat “unsatisfactory” indication of the
validity of their original Study of
Values (SV). Finally, Fensterheim and Tresselt
(5) studied the influence of expressed and
measured values on the perception of other
people, and so indirectly related expressed
and measured values to each other.

Although the findings of these studies agree
fairly well, certain doubts about the adequacy
of the methodological designs prompted the
writers to attempt a replication of previous
research. Some refinements introduced in the
present investigation are as follows:

1. Data on measured values were obtained
through the administration of the Allport-Vernon-Lindzey
revision (2) of the SV.

2. A method of self-rating which more
closely approximates the answering and scoring
procedures of the SV was utilized.

3. Two types of self-rating sheets were
used, and the data from both were analyzed
separately.


Method 


Instruments. Data on measured values were obtained
from the revised SV based on Spranger’s [...].


* The data were collected while both authors were
at the University of Missouri.



Data on expressed values were obtained from two
sources: a definitional rating sheet (DR) and an occupational
rating sheet (OR). Both rating sheets
were identical in direction and format except for
their explanations of Spranger’s six value areas. Each
value scale extended from “1” to “9” points with
respective labels of “lowest possible,” “extremely
low,” “very low,” “low,” “average,” “high,” “very
high,” “extremely high,” and “highest possible.” The
directions were as follows:

“Use the above Value Rating Sheet to indicate how 
much you value (prize) ‘theoretical,’ ‘economic,’ 
‘aesthetic,’ ‘social,’ ‘political,’ and ‘religious’ qualities 
for yourself. In other words, indicate the degrees 
to which you would like these six characteristics to 
describe you. Below are explanations of each value 
category. 

“In order to mark your rating for each category,
cross out with an ‘X’ the most appropriate number
in each of the six columns. Be sure that all six ratings
add up to 30. If the sum of the ‘numbers
crossed out’ does not add up to 30, then alter the
ratings without falsifying them so it will.”
These instructions involve a method of answering
and scoring which is similar to that inherent in the
SV. In other words, forced-choice answers are requested,
and the final results are expressed in terms
of continuous and possibly tied measures. This latter
procedure seems to be an improvement over
Stanley’s (7) method of having Ss rank themselves
without ties on the value scales.
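
The scoring constraint in the directions (six ratings, each on the nine-point scale, summing to 30) can be expressed as a simple check. This is only an illustrative sketch: the function name and the treatment of the scale as the integers 1 through 9 are assumptions based on the nine labels listed above, not a procedure described by the authors.

```python
def check_self_ratings(ratings, low=1, high=9, required_sum=30, n_values=6):
    """Return a list of problems with one subject's self-ratings (empty if valid)."""
    problems = []
    if len(ratings) != n_values:
        problems.append(f"expected {n_values} ratings, got {len(ratings)}")
    if any(r < low or r > high for r in ratings):
        problems.append(f"ratings must lie between {low} and {high}")
    if sum(ratings) != required_sum:
        problems.append(f"ratings sum to {sum(ratings)}, not {required_sum}")
    return problems


# Hypothetical rating sheets in the order theoretical, economic, aesthetic,
# social, political, religious.
print(check_self_ratings([7, 4, 3, 6, 5, 5]))   # valid: ties allowed, sum is 30
print(check_self_ratings([9, 9, 9, 9, 1, 1]))   # invalid: sum is 38
```

Because the total is fixed at 30, a subject who rates every value “average” (5) satisfies the constraint, and any rating above average must be offset by one below it, which is what makes the format forced-choice while still permitting ties.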

The DR explanations for each value area were as
follows:


Theoretical. The theoretical man is interested primarily
in the discovery of truth, i.e., thinking &
knowing. He most highly values being rational,
logical, critical, and intellectual.

Economic. The economic man is interested primarily
in what is useful, i.e., survival & efficiency.
He most highly values being practical, industrious,
businesslike, and wealthy.

Aesthetic. The aesthetic man is interested primarily
in form and harmony, i.e., beauty & loveliness.
He most highly values being sensitive, expressive,
artistic, and appreciative of attractive appearance.

Social. The social man is interested primarily in
the love of people, i.e., sympathy & service. He most
highly values being tender, kind, helpful, and unselfish.


Table 1

Group Correlations Between Expressed and Measured Values

                                                                  Self-Ratings X
                      Self-Ratings X SV Scores                    Self-Ratings

             Vernon &
             Allport a   Pintner    Stanley    DR         OR          DR X OR
SV Scale     N = 48      N = 187    N = 66     N = 76     N = 76      N = 76
Theoretical   .40**       .31**      .31**      .45**      .58**        ?
Economic       ?          .39**      .51**       ?          ?           ?
Aesthetic     .57**       .59**      .51**      .61**      .64**        ?
Social       -.06         .14        .15         ?         .58**        ?
Political     .44**      -.02        .24*       .60**      .66**        ?
Religious     .69**       .68**       ?          ?          ?           ?

a Vernon and Allport (8, p. 245) averaged five external ratings
with the self-rating for each subject.
* Significant at .05 level.
** Significant at .01 level.


Political. The political man is interested primarily
in power, i.e., might & control. He most highly
values being strong, influential, authoritative, and
renowned.

Religious. The religious man is interested primarily
in the unity of world outlook, i.e., ultimate
belief & understanding. He most highly values being
mystical and comprehending life’s wholeness, final
purpose, and deepest meaning.


The OR explanations for each value area were as
follows:

Theoretical. The theoretical man would most like
to be either a scientist, mathematician, or philosopher.

Economic. The economic man would most like to
be either a banker, businessman, or sales manager.

Aesthetic. The aesthetic man would most like to
be an artist in the field of either literature, painting,
music, or architecture.

Social. The social man would most like to be
either a doctor, nurse, or social welfare worker.

Political. The political man would most like to be
either a politician, civic leader, or military commander.

Religious. The religious man would most like to
be either a clergyman, religious worker, or active
church member.


Subjects. Ss ranged in age from 18 to 49 with a
median of 23. They were divided into two groups
according to the administration sequences of the
three instruments. Forty-five Ss (28 males and 17
females), comprising practically the total population
[...] undergraduate psychology [...] Missouri made up
Group [...] instruments for [...] that group was
OR-DR-SV. The investigation is based, therefore,
on a total N of 76.


Results 


Two major approaches to studying the
relationships between expressed (subjective)
and measured (objective) values are discernible
in previous investigations. Certain
results of this study will be compared with
the findings of the other researchers under
subheadings characterizing these two methods
of investigation.

Group consistency method. In this method,
a correlation coefficient between expressed and
measured values is calculated for each of the
six SV scales using all Ss in each computation.
This procedure reveals the degree to
which a given group’s professed values correspond
to its inventoried values. Table 1
reveals that the product-moment correlations
obtained by the present investigators substantially
agree with the rank-order and product-moment
correlations obtained by previous investigators,
except on the Social and Political
scales in which present correlations are considerably
higher. The correlations reported
by Pintner (6) and Stanley (7) appear
similar, while those found by Vernon and Allport
(8) and the present investigators seemingly
coincide. The one exception in the latter
comparison is the correlation for the Social
value.


Table 2

The Relationships Between Z Transformed DR Intra-Individual
Coefficients and Other Variables

              DR Intra-Individual Coefficients Transformed to Z

              Males      Females    Total      Group A    Group B
Variable      N = 56     N = 20     N = 76     N = 45     N = 31
Age           -.09        .03       -.04       -.11         ?
SD of SV       .52**      .33        .46**      .40**      .52**
SD of DR        ?         .25        .34**      .20        .51**
Theoretical     ?         .10         ?          ?         .30
Economic        ?        -.35        .08       -.09        .28
Aesthetic     -.21        .16       -.13       -.18       -.05
Social         .01         ?         .13        .26       -.01
Political      .19       -.31        .11         ?         .10
Religious     -.42**     -.27       -.38**     -.31*      -.49**

* Significant at .05 level.
** Significant at .01 level.


Through the use of Z transformations and
two-tailed tests of significance as suggested
by Edwards (4, pp. 131-132), no statistically
significant difference was found between group
consistency coefficients based on the DR and
those based on the OR.
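
A comparison of this kind can be sketched with Fisher's Z transformation. The block below is an illustration only: it uses the textbook formula for two independent correlations, which is an assumption (the DR and OR coefficients here come from the same 76 Ss, and the exact procedure the authors took from Edwards is not given), and it takes the Theoretical-scale values of .45 (DR) and .58 (OR) as read from Table 1. The function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's Z transformation of a correlation coefficient."""
    return math.atanh(r)


def z_test_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two correlations,
    treating the two samples as independent (an assumption here)."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error


# Theoretical scale: OR-based vs. DR-based group consistency coefficients.
z = z_test_difference(0.58, 76, 0.45, 76)
TWO_TAILED_05 = 1.96  # normal deviate required at the .05 level, two-tailed

print(f"z = {z:.2f}; significant at .05: {abs(z) > TWO_TAILED_05}")
```

Under these assumptions the difference falls well short of the two-tailed .05 criterion, in line with the conclusion stated in the text.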

Intra-individual consistency method. In 
this method a correlation coefficient between 
expressed and measured values is calculated 
for each subject using all six SV scales in each 
Computation. This procedure reveals the de- 
gree to which a given individual’s professed 
Values correspond to his inventoried values. 
he product-moment correlation coefficients 
etween DR markings and SV scores range 
from — .44 to .83 with a median 7 of 46, 
While the coefficients between OR markings 


and SV scores range from — .54 to .86 with 
a median r of .54. While Fensterheim and 
Tresselt (5) reported higher intra-individual 
coefficients—one third of theirs were above 
§2—their median of .66 is roughly similar to 
the medians obtained in this study. Stanley 
(7) found generally lower intra-individual co- 
efficients, but his range of —.81 to .98 is 
similar to Fensterheim and Tresselt’s results, 
and his median of .39 is perhaps within an 
equivalent range of all. 
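
The intra-individual method described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the subject's six self-ratings and six SV scores are hypothetical values invented for the example, and `pearson_r` is a helper name introduced here.

```python
from math import sqrt
from statistics import median


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data for two subjects, in the scale order Theoretical,
# Economic, Aesthetic, Social, Political, Religious.
subjects = {
    "S1": {"self_rating": [5, 3, 2, 4, 6, 1], "sv_score": [48, 41, 35, 40, 52, 24]},
    "S2": {"self_rating": [2, 6, 1, 3, 5, 4], "sv_score": [44, 39, 42, 37, 40, 38]},
}

# One coefficient per subject, each computed over the six scales.
coefficients = {
    name: pearson_r(d["self_rating"], d["sv_score"]) for name, d in subjects.items()
}
median_r = median(coefficients.values())
```

Each subject contributes a single coefficient based on only six pairs of values, which is one reason such coefficients can range widely from subject to subject.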

Intra-individual consistency and other vari-
ables. In order to test expressed-measured
consistency as it relates to other variables, all 
intra-individual coefficients were correlated 


Table 3 


The Relationships Between Z Transformed OR Intra-Individual Coefficients and Other Variables

OR Intra-Individual Coefficients Transformed to Z


Males Females Total Group A Group B 
= j= N=45 v= 
Variable N= 56 ae rae sist 
= —.18 —.27 AS 
Age HF A 
45** 35* sort 
SD of SV ag pe 29" 25 33 
SD of OR 26 i 4 15 ; 

A 34** = 93 .20 AS 19 
Theoretical 2 02 —12 16 
Economic -08 a : ‘ : 
ee 1 32 —.10 —.02 =.12 
S esthetic ‘Ol 33 .08 .23 —.08 

ocial si 01 10 —.04 

Aa 03 —.18 í 04 

Political : —11 = —.21 —.05 
Religious —13 š i i 


* Significant at .05 level.
** Significant at .01 level.




Table 4

Group Correlations Between Expressed and Measured
Values for “High Religious” and
“Low Religious” Males

DR Self-Ratings X SV Value Scores

                “High Religious”    “Low Religious”
                Males (a)           Males (b)
SV Scale        N = 14              N = 14
Theoretical     .188                .389
Economic        .408                .778**
Aesthetic       .107                .774**
Social          .097                .670**
Political       .360                .789**
Religious       .372                .504

(a) Range of SV Religious scores = 44-64.
(b) Range of SV Religious scores = 15-32.
** Significant at .01 level.


with Ss’ respective ages, SV and DR (or OR) 
standard deviations, and SV value scores. 
However, the authors followed Stanley’s (7) 
use of Z transformations (4, pp. 126-127) to 
normalize the negatively skewed distribution 
of intra-individual coefficients. Tables 2 and 
3 give the product-moment correlation coeffi- 
cients by sex and group between the Z trans- 
formed intra-individual coefficients and the 
listed variables. 

Intra-individual coefficients fail to reveal a 
statistically significant relationship to chrono- 
logical age. This finding is somewhat at vari- 
ance with Anderson’s observation that “the 
results reflect a maturity factor in that the 
self-rankings of the older group were in 
closer agreement with their score ranks than 
were those of the younger group” (3, pp. 354-
355). But the finding is in line with the .14
coefficient reported by Stanley (7). The
present data also parallel Stanley’s finding of
a positive and statistically significant correla-
tion between intra-individual coefficients and
SV standard deviations. This same relation-
ship apparently holds to a lesser degree when
DR and OR standard deviations are corre-
lated with intra-individual coefficients.

Additional results show that the intra-in- 
dividual coefficients and the Theoretical scores 
for males are positively and significantly re- 
lated regardless of whether the DR or OR is 
used as the self-rating sheet. Also, the rela- 


James B. Nickels and Guy A. Renzaglia 


tionship between intra-individual coefficients 
and Social scores for females is positive and 
statistically significant when the DR is the 
self-rating sheet (.55), and approaches sig-
nificance when the OR is the self-rating sheet
(.33).

Interestingly, males with high agreement
between their expressed and measured values
tend to score lower on the Religious scale of
the SV than males with low agreement. The
relationship is statistically significant only
when the DR is the self-rating sheet, but a
trend is also apparent even when the OR is
used. In view of this finding an additional
analysis was made. All male Ss were ar-
ranged according to their SV Religious scores.
Table 4 reveals the product-moment group
consistency coefficients for both the “high re-
ligious” (upper one fourth) males and “low
religious” (lower one fourth) males. In all
six value areas the group consistency coeffi-
cients for the “high religious” males are
lower than those for the “low religious” males.
However, through the use of Z transforma-
tions and two-tailed tests of significance as
suggested by Edwards (4, pp. 131-132), the
two groups were found to differ significantly
only in the Aesthetic and Social areas
([illegible] level).
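
A comparison of this kind can be sketched with the usual Fisher Z procedure for two independent correlations. The sketch below is an assumption about the form of the test, not a reproduction of the authors' worksheet: the function name is introduced here, the inputs are the Aesthetic coefficients of Table 4 read as .107 and .774 with N = 14 in each group, and the significance level the authors reported for this comparison does not survive in the text.

```python
from math import atanh, erfc, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed test of the difference between two independent correlations.

    Each r is transformed to Fisher's Z (the inverse hyperbolic tangent);
    the difference is divided by its standard error, sqrt(1/(n1-3) + 1/(n2-3)).
    """
    z1 = atanh(r1)
    z2 = atanh(r2)
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed probability under the standard normal distribution.
    p_two_tailed = erfc(abs(z) / sqrt(2.0))
    return z, p_two_tailed


# Aesthetic scale: "high religious" males versus "low religious" males.
z_aesthetic, p_aesthetic = fisher_z_difference(0.107, 14, 0.774, 14)
```

With only 14 Ss in each group the standard error is large, so even sizable differences between coefficients may fall short of significance.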

From the information contained in Tables
2 and 3, subgroups were compared through
the use of Z transformations and two-tailed
tests of significance. In Table 2 males and
females differ significantly on the Social scale
(.05 level) and approach significance on the
Economic scale (.06 level), whereas in
Table 3 they differ significantly on both the
Theoretical (.05 level) and Aesthetic
([illegible] level) scales. In other words, when
the results from both tables are considered
together, intra-individual coefficients tend to
have a higher positive relationship to males’
Theoretical and Economic scores than to
females’. And similarly, these same coefficients
tend to have a higher positive relationship to
females’ Aesthetic and Social scores than to
males’. Therefore, certain sex differences
were revealed in the present study. However,
no statistically significant difference was
found between intra-individual coefficients
based on the DR and those based on the
OR. Similarly, no statistically significant
difference was revealed between results based
on Group A Ss and results based on Group B
Ss.


Discussion 


The results of this study suggest that a 
high positive relationship exists between ex- 
pressed and measured values for most stu- 
dents, but this relationship is not sufficiently 
high to make the two interchangeable. Nev- 
ertheless, for a few students expressed and 
measured values seem to reveal almost identi- 
cal results. Thus, the problem for future re-
search is to isolate those factors which may
determine when the relationship between ex-
pressed and measured values will be near per-
fect correspondence, mediocre similarity, or
complete reversal. In the present study three
variables seem to be so related: variability on 
the SV, DR, and OR; religious emphasis on 
the SV; and sex differences on the SV. 

As mentioned previously, students with
large variation in their SV scores tend to have
more similar expressed and measured values
than those with small variation in their SV
scores. This same relationship holds to a
lesser degree for variability on the self-rating
sheets. Of course, these results are relevant
only if little or no “contamination” of data
occurs when standard deviations are corre-
lated with intra-individual coefficients based
partially on these standard deviations. The
authors advance the interpretation that stu-
dents’ response-sets to answer value-state-
ments at a less variable level may reveal an
insensitivity to, an uncertainty in, or an un-
awareness of their value-systems. There-
fore, high variability would indicate higher
self-sensitivity, greater self-certainty, or clearer
self-awareness. Additional research may give
evidence for the plausibility of this inter-
pretation.

Concerning religious emphasis, the more re-
ligious male students are (according to their
Religious SV score), the less similarity they
tend to display between their expressed and
measured values. In addition, since the group
consistency coefficients of “high religious”
males are lower than those of “low religious”
males (even though the differences are sta-
tistically significant in only two value areas),
perhaps the former group is really less aware 
of its value-system and value-emphasis than 
the latter. But another possible interpreta- 
tion is that highly religious students may not 
represent a homogeneous population. A sub- 
group within highly religious students actu- 
ally may have almost identical expressed and 
measured values, but in the present study its 
members were in the minority. If such were 
the case, these students might be distinguish- 
able by their really being religious instead of 
just wanting to seem religious, or by their 
using religious zeal as an expression of per- 
sonal ideals instead of as a defense against 
personal problems. Further research may 
not only confirm the present finding but also 
may substantiate one of the above interpreta- 
tions. 

The significant and near significant sex dif- 
ferences in Tables 2 and 3 give rise to an 
interesting hypothesis. Men who score high 
on the Theoretical, Economic, and Political 
scales of the SV, the so-called “masculine” 
values (1, 2, 8), tend to have more similar 
expressed and measured values than those 
who score low on these scales. Likewise, this 
same trend is apparent for women who score 
high on the Aesthetic, Social, and Religious 
scales of the SV, the so-called “feminine” 
values. In fact, males and females who score 
high on values attributed to their own sex 
and score low on values attributed to the op- 
posite sex seemingly have closer expressed 
and measured values than individuals in the 
reversed situation. But since this interpreta- 
tion is only apparent as a trend in the data, 
further research utilizing different procedural 
and statistical analyses is indicated. 

An additional result is the lack of any 
statistically significant difference between the 
two self-rating instruments, even though the 
DR used abstract definitions and the OR 
concrete occupational titles. Nevertheless, as 
merely a general trend, the DR appears to 
be more sensitive to intra-individual consist- 
ency relationships, whereas the OR seems to 
be more sensitive to group consistency rela- 
tionships. Further research specifically in- 
vestigating these two types of self-rating
sheets is necessary.


Even though the authors failed to find a sig- 
nificant relationship between intra-individual 
coefficients and chronological age, this failure 
may be due to inappropriate methodology. 
Perhaps chronological age is highly related to 
intra-individual coefficients when one person 
is compared at different times, but not when 
different people are compared at one time. 
Appropriate methodology would then require 
the study of the same Ss over a period of 
years. Future research of the longitudinal 
variety is therefore recommended. 

As a replication, the present investigation 
corroborates most of the findings reported in 
previous studies. Nevertheless, methodologi- 
cal refinements—particularly the administra- 
tion of the revised SV in conjunction with 
a sounder way of obtaining self-ratings of 
values—resulted in a marked increase in 
group consistency coefficients for the Social 
and Political scales. Apparently the restand- 
ardization of the Social scale on the revised 
SV brought the value-inventory more in line 
with the value-rating in regard to what they 
both “get at.” But since the increased cor- 
relation for the Political scale cannot be ex- 
plained entirely on the same basis, perhaps 
the generally higher relationships in this study 
may be due also to improved self-value-rat-
ing-instruments.


Summary 


This study, in the order of a combined 
replication of a number of earlier ones, intro- 
duced methodological improvements in the 
investigation of the relationships between ex- 
pressed and measured values. Through the 
administration of the revised Study of Values 
and two self-rating sheets (one using defini- 
tions of the six Study of Values scales, the 
other utilizing related occupational titles),
data on 76 Ss were obtained and analyzed for 
relationships. 

On the basis of group and most intra-indi-
vidual correlations, Ss seem to have a rela-
tively significant awareness of their measured
values. Nevertheless, individuals vary con-
siderably in the similarity between their ex-
pressed and measured values, from near per-
fect correspondence to complete reversal.

The analysis clearly points out that the
more students vary in their scores on the
Study of Values, the more similar their ex-
pressed and measured values tend to be. To
some extent this is also true for variability
on the self-rating sheets. In addition, the
higher male students prize religious values
as measured by the Religious scale of the
Study of Values, the less similar their ex-
pressed and measured values tend to be. On
the other hand, male and female students who
score high on values attributed to their own
sex and score low on values attributed to the
opposite sex seemingly have closer expressed
and measured values than students in the
opposite situation.

Certain areas where future investigations
might contribute were also noted.


Received April 22, 1957.


References


1. Allport, G. W., & Vernon, P. E. A study of
values. New York: Houghton Mifflin, 1931.
2. Allport, G. W., Vernon, P. E., & Lindzey, G. A
study of values. (Rev. ed.) New York:
Houghton Mifflin, 1951.
3. Anderson, Rose G. Subjective ranking versus
score ranking of interest values. Personnel
Psychol., 1948, 1, 349-355.
4. Edwards, A. L. Experimental design in psycho-
logical research. New York: Rinehart, 1950.
5. Fensterheim, H., & Tresselt, M. E. The influence
of value systems on the perception of people.
J. abnorm. soc. Psychol., 1953, 48.
6. Pintner, R. A comparison of interests, abilities,
and attitudes. J. abnorm. soc. Psychol.,
27, 351-357.
7. Stanley, J. C. Insight into one’s own values. J.
educ. Psychol., 1951, 42, 399-408.
8. Vernon, P. E., & Allport, G. W. A test for per-
sonal values. J. abnorm. soc. Psychol.,
26, 231-248.


Journal of Applied Psychology
Vol. 42


Differential Self-Perceptions of Management Personnel and 
Line Workers 


Lyman W. Porter 


University of California 


Recent papers have stressed the importance
of obtaining information on self-perceptions
for an increased understanding of the psycho-
logical aspects of industrial organization and
functioning (2, 4). Self-perceptions are a
function of both the individual’s enduring
traits and his present social roles. Thus,
when a person describes how he perceives
himself, he must to some extent use his pres-
ent social environment as a reference point to
which he can relate his own behavior and his
own personality traits. For any employed per-
son the work environment is an important
component of his total social environment, and
therefore, self-descriptions “would appear to
be of special interest since they provide cues
to the place the individual sees for himself
in his organization and to the manner in
which he sees himself functioning” (4).

Traditionally, management personnel and
line workers have been the two major groups
in industrial organization that have been
contrasted with each other in consideration
of the personnel aspects of industrial opera-
tions. For this reason, a comparison of the
self-perceptions of management personnel vs.
the self-perceptions of line workers would
seem especially relevant for understanding the
psychological problems of the work situation.

The importance of perception in labor-man-
agement relations has been suggested by a
number of writers. A study of role percep-
tions by Haire, for example, illustrates the
striking effects of differences in perception of
a person when he is labeled as either
a labor official or a management official, and
when the perceivers are either workers or
members of management (3). Haire con-
cludes from his data that “the general im-
pression of a person is radically different
when he is seen as a member of management
than when he is seen as a representative of
labor” and that “management and labor each
sees the other as less dependable than him-
self . . . less appreciative of the other’s po-
sition than he himself is . . . [and] deficient
in thinking, emotional characteristics, and in- 
terpersonal relations in comparison with him- 
self” (3, p. 211). Since self-perceptions are 
a factor in the role perceptions of others, as 
well as in the role perceptions of one’s own 
group, a study of how line workers view 
themselves as contrasted with how manage- 
ment personnel view themselves should help 
in interpreting the behavior of labor and 
management groups in wage negotiations and 
other labor relations situations. 

At the same time, such data should pro- 
vide a check on whether individuals in these 
groups see themselves in accordance with 
some of the traditional characteristics at- 
tributed to them by others and even by some 
members of their own group. In other words, 
it is relevant to ask whether line workers look 
at themselves the same way others are prone 
to think about them. A similar question can 
be asked in regard to management personnel. 

The purpose of this study is to compare the 
self-perceptions of line workers and manage- 
ment personnel employed in a variety of dif- 


ferent industries. 


Method and Procedure 


The instrument used in this study to obtain the 
self-descriptions was a 64-pair forced-choice adjec- 
tive check-list developed by Ghiselli and used in 
previous studies (1, 2, 4). Thirty-two of the pairs 
involve adjectives which are descriptive of different, 
but desirable, social traits. The S is asked to select 
the adjective in each pair that he believes best de- 
scribes himself. The other 32 pairs involve adjec- 
tives descriptive of socially undesirable traits, and 
S must choose the word in each pair that is least 
characteristic of himself. If individuals in two 
groups (e.g., line workers and management person-
nel) check the list, it is possible to discern descrip-
tive patterns that distinguish the two groups.

The self-description inventory was filled out by 
463 management personnel and 320 line workers.
For the purpose of this study, “management person- 
nel” are defined as those who have any supervisory 




Table 1 


Items Differentiating Management Personnel 
and Line Workers 


Management Personnel Line Workers 


See themselves as: See themselves as: 


inventive cooperative 
loyal dependable 
resourceful planful 
clear-thinking efficient 
sincere calm 
fair-minded thoughtful 
responsible reliable 
dignified civilized 
imaginative self-controlled 
logical adaptable 
Do not see themselves as: Do not see themselves as: 
immature quarrelsome 
affected moody 
cold stubborn 
infantile conceited 
intolerant nervous 
foolish careless 
weak selfish 
rude self-centered 
rattle-brained disorderly 
submissive fussy 
self-pitying hard-hearted 
cynical aggressive 
dissatisfied outspoken 
sly excitable 
irresponsible impatient 


duties. Thus, first line supervisors and foremen are 
included in the management group. All those who 
have no supervisory duties, i.e., those in the lowest
level of the organization, are classified as “line 
workers.” Both the management and line individu- 
als are from a number of different organizations of 
different sizes and located in widely scattered geo- 
graphical areas throughout the country. The inven-
tory was administered ordinarily in connection with
some sort of a personnel audit, rather than in a 
strictly research connection. 


Results 


The responses of the individuals in the two 
groups were analyzed for each of the 64 pairs 
of adjectives. Twenty-five, or slightly over 
one-third, of the items differentiated the two 
groups at the .05 level of confidence or better.
Ten of the pairs were composed of favorable 
adjectives, and the other 15 pairs of unfavor- 
able adjectives that “least describe” the in-
dividual. The 25 differentiating pairs are
presented in Table 1, where the responses of 
management personnel are given in the left- 
hand column, and those of the line workers 
in the right-hand column. It should be noted 
that when the results are presented in this 
manner, the differences are relative, and do 
not necessarily indicate that one adjective in 
a pair was favored by the majority of one 
group and the other adjective in the pair by 
a majority of the other group. A majority 
of both groups may have favored the same 
adjective in a pair, but one group favored it 
significantly more often than the other group. 
For example, for the first pair of adjectives 
listed in Table 1, a majority of both groups 
favored “cooperative,” but the line workers 
selected it relatively more often than manage- 
ment personnel, the percentages being 87.2 
for line and 78.6 for management. In other 
words, proportionately more line workers saw
themselves as “cooperative” than did man-
agement personnel, even though a majority of
both groups would describe themselves as
“cooperative.” It also must be noted that
when a person checks one word in a pair on
a forced-choice inventory such as used in this
study, he is not necessarily rejecting the
other word. He is only indicating that one
adjective is more or less descriptive of him
than the other adjective.
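
The “cooperative” example can be checked with a two-proportion comparison. The report does not name the test used to establish the .05 level, so the pooled two-proportion z test below is an assumption chosen for illustration; the function name is introduced here, and the inputs are the percentages and group sizes given in the text (87.2% of 320 line workers, 78.6% of 463 management personnel).

```python
from math import erfc, sqrt


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z test with a two-tailed normal probability."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / standard_error
    p_two_tailed = erfc(abs(z) / sqrt(2.0))
    return z, p_two_tailed


# Proportion choosing "cooperative": line workers vs. management personnel.
z_coop, p_coop = two_proportion_z(0.872, 320, 0.786, 463)
differentiates_at_05 = p_coop < 0.05
```

Under this assumed test the pair differentiates the groups comfortably beyond the .05 level, even though a majority of both groups chose the same adjective.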

The two lists given in Table 1 provide evi-
dence for reasonably well-integrated pictures
of the two groups. Management personnel,
as contrasted with line workers, perceive
themselves in terms of leadership qualities.
They picture themselves as possessing traits
that are ordinarily associated with leadership
behavior. They see themselves as strong and
relatively dominant types of individuals in
comparison with how members of the line see
themselves. The management personnel de-
scribe themselves as possessing a good degree
of initiative and independence of thought and
action. Further rounding out the picture of
leadership qualities is the selection of self-
descriptive traits that imply maturity and
fairness. Throughout, the members of man-
agement relatively more often choose adjec-
tives that describe themselves as being “fair-
minded,” “responsible,” and not being “im-
mature,” “intolerant,” “irresponsible,” and
“infantile.” They look at themselves as be- 
ing the type of people who have qualities of 
leadership and who know how to exercise 
them appropriately. They also indicate a 
concern with appearing as straight-forward
people in their dealings with others.

The self-descriptions of the line workers
present a contrasting picture to that of the
management self-descriptions. The line work-
ers relatively more often checked adjectives
that would place them more towards the “fol-
lower” end of the leadership dimension. Es-
sentially, they see themselves as being ca-
pable, steady, and agreeable types of indi-
viduals. The picture is one of “nice guys”
who can be depended upon, and who will not
try to cause trouble. A number of the ad-
jectives that they chose relatively more often
than management closely fit together to ob-
tain this picture: i.e., “cooperative,” “depend-
able,” “reliable,” and “self-controlled,” and
not “quarrelsome,” not “stubborn,” not “ag-
gressive,” and not “outspoken.” Socially,
they, like management, view themselves as
being concerned about the rights of others
and of not appearing to be egocentric. In
comparison to management personnel, fewer
of them picture themselves as likely to be
irritable and easily upset. They seem to feel
that they have a closer control over their
emotions than do members of management.
In summary, the line workers describe them-
selves as even-tempered individuals who can
be relied upon to perform quite adequately
in cooperation with other individuals.


Discussion 


When the over-all self-perceptions of the
two groups are contrasted with each other,
what seems to emerge is that management
personnel have pictured themselves in a way
that closely fits a “leader” stereotype, while
line personnel give the complementary pic-
ture of a “follower” stereotype. The particu-
lar pairs of traits that distinguish between
the two groups concern items that are typi-
cally thought to be especially associated with
leadership and followership. Consistently
throughout the 25 pairs, management per-
sonnel select traits more nearly towards the
leadership end of the dimension, and line per-
sonnel select traits that put them towards
the followership end.

The self-perception description of manage- 
ment people is one that might be expected if 
management’s role is conceived as primarily 
a role of leadership and direction. In one 
sense, the self-perception description of line 
workers is not necessarily that which might 
have been expected. In another sense, it is. 
If line workers are considered only in regard 
to their hierarchical position in the organiza- 
tion, they must of necessity be followers more 
often than management personnel simply by 
virtue of their position at the bottom of the
organization. The results indicate that line 
workers perceive themselves in accordance 
with this expectation. On the other hand, if 
line workers are thought of as being members 
of “labor,” then the pattern of self-perception 
traits that emerges is not necessarily that 
which fits an often pictured stereotype of 
“militant labor.” There is little if anything
in the self-description pattern of line person-
nel, as that pattern is contrasted with man-
agement personnel, that shows these workers
viewing themselves in any way except as rela-
tively passive, cooperative and nonaggressive
persons.

An implication of the foregoing is that line
workers may be perceiving themselves much
more in terms of their position in the organi-
zation, rather than as members of a labor
union or “workers” group. If this is so, it
means that the traditional picture of the
“typical” worker would definitely have to
take into account his organization role in re-
lation to management, as well as his labor
role vis à vis management. The results indi-
cate, then, that if either management groups
or union groups think of workers merely in
terms of people who are out to oppose and
fight management’s leadership at every point,
their views are not consistent with how line
personnel actually look at themselves.

The results also raise some interesting
points in connection with the recruitment of
management personnel from line workers. If
most of the people who tend to enter the or-
ganization at the line level and stay there for
a period of time come to view themselves in
terms of characteristics that are closely asso- 
ciated with followership rather than leader- 
ship, would they then function effectively if 
promoted to management supervisory jobs? 
A related question would be whether they 
would be satisfied in those jobs. As was 
pointed out earlier, self-descriptions of the 
type reported here are presumably deter- 
mined by the relatively enduring traits the 
person sees himself possessing and by the 
particular roles he sees himself fulfilling at 
the moment. To the extent that the self-per- 
ceptions represent long-lasting traits, the data 
would suggest that many line workers would 
not be suitable for management type posi-
tions. To the extent the self-descriptions are
a function of the particular role of the mo-
ment, then the data would not provide an-
swers to the questions. In the latter case,
the data would imply that line workers clearly
see their role of the moment, but would not
necessarily indicate how these workers would
fit into new leadership roles required in man-
agement positions.

Summary 

A 64-pair forced-choice adjective check-list
was filled out by 463 management personnel
and 320 line workers. The responses of the
individuals in the two groups were analyzed 
for each of the pairs of adjectives, and it was 
found that 25 pairs differentiated the two 
groups at the .05 level of confidence or better. 
From these differentiating adjectives, inte- 
grated pictures of the self-perceptions of the 
two groups were developed. Management 
personnel tended more often to describe them- 
selves in terms of leadership-type traits, 
whereas line workers relatively more often 
pictured themselves in cooperative-follower 
terms. These findings were discussed as to 
their implications for understanding organiza- 
tional structure and functioning and for la-
bor-management relations.


Received May 27, 1957. 


References 
1. Ghiselli, E. E. The forced-choice technique in
self-description. Personnel Psychol., 1954, 7,
201-208.
2. Ghiselli, E. E., & Barthol, R. Role perceptions
of successful and unsuccessful supervisors. J.
appl. Psychol., 1956, 40, 241-244.
3. Haire, M. Role perceptions in labor-management
relations: an experimental approach. Industr.
Labor Rel. Rev., 1955, 8, 204-216.
4. Porter, L. W., & Ghiselli, E. E. The self percep-
tions of top and middle management person-
nel. Personnel Psychol., 1957, 10, 397-406.


Journal of Applied Psychology
Vol. 42


Practice Effect on the Minnesota Clerical Test When 
Alternate Forms Are Used ¹


Howard P. Longstaff 


University of Minnesota 


and Leslie A. Beldo 
Campbell-Mithun, Inc., Minneapolis


In an earlier study (1) a marked practice
effect was demonstrated when the Minnesota
Clerical Test was administered successively
at intervals of two to seven days. As pro-
posed in the report of that study, one pos-
sible means of overcoming such practice ef-
fect, or at least of minimizing it to tolerable
levels, may be the use of alternate test forms.
The purpose of the present study was to test
this hypothesis with two alternate forms of
the Minnesota Clerical Test, Form A, now on
the market, and Form B, not on the market.


Procedure 


In October, November, and December, 1953,
and November, December, and January, 1954-55,
the two forms of the Minnesota Clerical Test were
given to 575 female applicants for clerical jobs at
three organizations located in Minneapolis. Included
were 300 applicants at the University of Minnesota,
150 at a major industrial plant, and 125 at a na-
tional insurance company.

The forms of the test were administered to
applicants in A B B A order to counterbalance
any effects from differences in test difficulty
or other uncontrollable variation. Form A,
then B, were given to the first applicant; Form B,
then A, to the second; and so on from applicant to
applicant.

The second form was given one minute after com-
pletion of the first. A longer, more natural time
interval between tests would have been desirable, but
was quite impractical to arrange. A one-minute in-
terval, it should be noted, very likely resulted in a
maximum influence of practice effect.


Results and Discussion 
Table 1 gives the results for the combined
groups of applicants. Practice effects do
occur, it is apparent, on both the Numbers
and Names tests. On the Numbers test, the
difference between means from trial
to trial is 7.7 points, significant at the .001
level. The Names test shows a comparable
difference in means of 7.8 points, likewise sig-
nificant at the .001 level. The changes in
centile ranks, representing practice effect,
amount to about 10 centile points on em-
ployed clerical worker norms.

¹ This research was made possible by a grant
from the Graduate School of the University of
Minnesota.

It is important to note that the practice 
effect for both number checking and name 
checking is less than one-third of the size of 
the standard deviations. Also, it is note- 
worthy that the standard deviations remain 
constant from Trial 1 to Trial 2 for both 
parts of the clerical test. 
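
The reported significance tests can be approximately recovered from the summary statistics. The sketch below assumes a t test for correlated means, which the report does not name explicitly, and reads the trial-to-trial correlations in Table 1 as .82 and .86; the function name is introduced here for illustration.

```python
from math import sqrt


def correlated_t(mean_1, sd_1, mean_2, sd_2, r_12, n):
    """t for the difference between two correlated means, from summary data.

    The standard deviation of the difference scores is
    sqrt(sd_1^2 + sd_2^2 - 2 * r_12 * sd_1 * sd_2).
    """
    sd_difference = sqrt(sd_1 ** 2 + sd_2 ** 2 - 2.0 * r_12 * sd_1 * sd_2)
    standard_error = sd_difference / sqrt(n)
    return (mean_2 - mean_1) / standard_error


# Values from Table 1 (575 applicants, Trial 1 then Trial 2).
t_numbers = correlated_t(120.8, 26.9, 128.5, 27.5, 0.82, 575)
t_names = correlated_t(119.3, 29.2, 127.1, 29.4, 0.86, 575)

# Size of the gain relative to the Trial 1 standard deviation.
gain_ratio_numbers = (128.5 - 120.8) / 26.9
gain_ratio_names = (127.1 - 119.3) / 29.2
```

Under these assumptions the Numbers value comes out close to the printed 11.3, and the Names value lands near 12, slightly below the printed 12.2, a gap that rounding of the correlation could plausibly account for. Both gains are under one-third of a standard deviation, as the text states.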

Comparing practice effects from repeated 
administrations of alternate forms and identi- 
cal forms—the latter reported in the earlier 
study (1)—we find a smaller gain in score 
with alternate forms, supporting the hypothe-
sis of the study, i.e., that alternate forms of
the Minnesota Clerical Test would reduce 


practice effect. 


Table 1

Practice Effects with Alternate Forms of the Minnesota Clerical Test
Administered to 575 Applicants for Clerical Work

                                 Number Checking      Name Checking
                                 Trial 1   Trial 2    Trial 1   Trial 2
M                                 120.8     128.5      119.3     127.1
s                                  26.9      27.5       29.2      29.4
D                                      +7.7                 +7.8
t                                      11.3                 12.2
P                                      .001                 .001
r                                       .82                  .86
Centile rank of mean scores
  (employed clerical workers)        44        30         13        22




Table 2

Comparison of Practice Effects with Identical and Alternate Forms
of the Minnesota Clerical Test

                                 Number Checking      Name Checking
                                 Trial 1   Trial 2    Trial 1   Trial 2
Mean Score
  Identical Forms a               137.0     157.9      142.7     170.8
  Alternate Forms                 120.8     128.5      119.3     127.1
Difference Between Mean Scores
  Identical Forms                      20.9                 28.1
  Alternate Forms                       7.7                  7.8
Percentage Increase in Mean Scores
  Identical Forms                      15.3%                19.6%
  Alternate Forms                       6.4%                 6.5%

a Mean scores for first and second administration of identical tests
to 32 female psychology students.


Comparative practice effects from the two 
methods of test administration are shown in 
Table 2. 

Though the two study groups are not strictly comparable (one scores
considerably higher on the Minnesota Clerical Test than the other),
there is a strong tendency for alternate forms to show less increase
in mean score than identical forms, on both Numbers and Names tests.
The increase in mean score on the Numbers test is 15.3% for identical
forms, 6.4% for alternate forms. On the Names test, the increase is
19.6% for identical forms, 6.5% for alternate forms.
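The percentage increases quoted here follow from the mean scores in Table 2. The sketch below recomputes them from the rounded means; `pct_increase` is a helper name introduced only for this illustration. Working from the rounded means, the identical-forms Names figure comes out near 19.7 rather than the printed 19.6, presumably because the published value was computed before rounding.

```python
def pct_increase(first: float, second: float) -> float:
    """Percentage gain from the first to the second administration."""
    return 100 * (second - first) / first


# (first-trial mean, second-trial mean) taken from Table 2.
means = {
    ("Alternate", "Numbers"): (120.8, 128.5),
    ("Alternate", "Names"): (119.3, 127.1),
    ("Identical", "Numbers"): (137.0, 157.9),
    ("Identical", "Names"): (142.7, 170.8),
}

increase = {key: round(pct_increase(*pair), 1) for key, pair in means.items()}

for (forms, test), value in increase.items():
    print(f"{forms} forms, {test}: {value}%")
```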

In the practical terms of employee selection, the amount of practice
effect found with alternate forms of the Minnesota Clerical Test
seems, to the authors, to be tolerable. With increased time intervals
between repeated testings, the practice effect probably would be even
less than that found here, when the tests were administered only a
minute apart.

The immediate practical implication would seem to be publication of
additional forms of the Minnesota Clerical Test to combat the
practice-inflated test scores of itinerant applicants in everyday
personnel situations. Development of additional forms of the
Minnesota Clerical Test would be complicated, however, by what
strongly appears to be a substantial difference in form difficulty.

* Subjects in the first test were undergraduate and graduate students
and extension students, who tend to score high on the Minnesota
Clerical Test. Subjects in the second test were day-to-day applicants
for clerical positions.

Howard P. Longstaff and Leslie A. Beldo

Quite unexpectedly, Form B proved to be more difficult (score lower)
than Form A, particularly in Numbers:

Names:    B < A by 2.5 points
Numbers:  B < A by 5.9 points


The differences in form difficulty and tests of significance are
summarized in Table 3, based on a special analysis of 293 subjects
from the total sample. Scores for each test in each form were
combined irrespective of test order, justified by an analysis showing
the difficulty effect to be independent of order. For example, scores
for the Names test for subjects taking Form A before Form B and those
taking Form B before Form A were combined into a single set of scores.

Compared with mean practice effects, it is apparent that the
difference in difficulty for Numbers (5.9) is nearly equal to the
difference in practice effect (7.7). Unaccountably, and somewhat
contrary to reasonable expectation, Names show a smaller difference
in form difficulty (2.5) than Numbers (5.9).

Rationally, one would assume that the simple subject matter of the
Numbers test would inevitably yield comparable levels of difficulty.
Yet the difference, sizeable and statistically significant, is there,
explained by no immediate reason other than the generalization that
certain patterns of names and numbers in one form must be inherently
more difficult than in the other. Which is to say that resolving the
enigma of form inequality is a subject for further study in itself.

Table 3

Differences in Form Difficulty on the Minnesota Clerical Test

                     Names                Numbers
                Form A    Form B     Form A    Form B
N                 293       293        293       293
M                123.0     120.5      126.8     120.9
S diff.               2.6                  2.2
D (A - B)             2.5                  5.9
t                     .96                 2.68
P                     .30                  .01

Practice Effect on Minnesota Clerical Test
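The significance tests in Table 3 can be read as each mean difference divided by its standard error. The sketch below assumes that the table's "S diff." row is the standard error of the difference and that t is the simple ratio of the two; neither point is spelled out in the text.

```python
# Mean differences (Form A minus Form B) and their standard errors,
# as read from Table 3 and the accompanying text.
diff = {"Names": 2.5, "Numbers": 5.9}
se_diff = {"Names": 2.6, "Numbers": 2.2}

# t ratio: difference between means over its standard error.
t_ratio = {test: diff[test] / se_diff[test] for test in diff}

for test, t in t_ratio.items():
    print(f"{test}: t = {t:.2f}")
```

On this reading the Numbers difference is more than two and a half times its standard error, while the Names difference is smaller than its standard error, which matches the pattern of significance reported.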

Before alternate forms of the Minnesota 
Clerical Test could be put on the market, 
which is the practical hypothesis underlying 
this project, the disparity in difficulty level of 
the Numbers test would have to be resolved. 
Ignoring test difficulty and developing addi- 
tional test forms would only confound prac- 
tice and difficulty effects hopelessly and ren- 
der comparison of test scores with a single 
norm impossible. Empirically establishing 
difficulty levels and recalibrating tests would 
involve the formidable problems of testing 
large samples and equalizing test difficulty. 

One practical first step would seem to be a 
basic study of test content, particularly the 
Numbers test. Content analysis of number 
frequency and number patterns might reveal 
significant differences between forms account- 
ing for the variation in difficulty. Or, num- 
ber patterns and number frequencies could be 
varied experimentally to isolate the basic difficulty variables.
Although this study was planned in anticipation of a practical
recommendation, there seems to be no alternative now but to recommend
a basic course of action before any development of additional forms
of the Minnesota Clerical Test.


Summary 


1. Repeated administrations of alternate 
forms of the Minnesota Clerical Test show 
less practice effect than repeated tests with 
identical forms. 

2. The degree of practice effect is a 6.4% 
increase in mean score on the Numbers test, 
6.5% on the Names test, when Forms A and 
B are administered in ABBA order. 

3. The immediate development of alternate test forms of the Minnesota
Clerical Test is contra-recommended in view of the substantial
differences in test difficulty on the Numbers test.

4. Any attempt to develop alternate forms of the Minnesota Clerical
Test for the Numbers test should be preceded by a basic study of
test-content factors underlying inequalities in form difficulty.


Received May 31, 1957. 


Reference 


1. Longstaff, H. P. Practice effect on the Minnesota 
Vocational Test for Clerical Workers. J. appl. 
Psychol., 1954, 38, 18-20. 


Journal of Applied Psychology 
Vol. 42, No. 2, 1958 


An Empirical Classification of Error Patterns in Receiving
Morse Code ¹


Richard W. Highland 
Hughes Aircraft Co. 


and Edwin A. Fleishman 


Yale University 


One of the unanswered questions in code 
training concerns the best procedure for help- 
ing students eliminate their individual errors 
in receiving. It is well known that learning 
to receive some Morse Code characters is 
more difficult than learning to receive others, 
and that students have a strong tendency to 
confuse certain characters with certain other 
characters while some characters are seldom 
confused. It seems likely that a meaningful 
categorization of such errors might serve as 
a basis for improving training procedures in 
this area. 

There have been a number of previous at- 
tempts to classify such code error patterns. 
These categorizations have been largely sub- 
jective and involve certain untested prior as- 
sumptions. Seashore and Kurtz (3), for ex- 
ample, suggest the following classification: 

1. Errors involving the shortening of the signal.

2. Errors involving the lengthening of the signal.

3. Errors characterized by complete substitution of dots for dashes
and dashes for dots.

4. Errors characterized by alteration of the elements within the
signal.

5. Miscellaneous.

On the other hand, War Department Technical Manual TM 11-459,
“International Morse Code (Instructions)” (6) suggests the existence
of two types of errors, “dotting errors” and “copying too close.” The
first of these is said to occur at high-speed receiving levels and
consists of characters which differ from each other only in the
number of dots contained in the signal. The error of “copying too
close” is said to occur when a student starts writing a response
before he has heard all of the signal. The effect of this is to
perceive a signal shorter than the signal sent. This corresponds to
the first category suggested by Seashore and Kurtz.

¹ This research was carried out while the writers were with the Air
Force Personnel and Training Research Center. The work was done under
ARDC Project No. 7706 in support of the research and development
program of the Air Force Personnel and Training Research Center,
Lackland Air Force Base, Texas. Permission is granted for
reproduction, translation, publication, use and disposal in whole and
in part by or for the U. S. Government.

The purpose of the present study was to organize code errors made by
radio operator trainees into meaningful categories derived
empirically from code error data. Specifically, the study: (a)
determined the order of difficulty of the most frequent substitution
errors; (b) applied the techniques of factor analysis to the
intercorrelations among these errors; and (c) examined the
relationship between the error factors obtained and the most
frequently made errors. The primary objective, then, was to group the
most frequent errors into a small number of independent categories.
The inference is that such categories correspond in some way to
underlying sources of difficulty contributing to a variety of code
substitution errors.


Method 
Subjects 


The Ss in this study were Air Force radio operator trainees. These
students had passed their final checks for receiving code at a speed
of six groups per minute and were learning to receive code at a speed
of eight groups per minute.²


Data Collection Procedure

The data collected were “substitution errors” in which the student’s
response to an auditory signal consisted of writing down a number or
letter different from the number or letter which was actually sent.
The student might, for example, write the letter “S” when the letter
“H” was sent (in this case the student perceives .... as ...). An
“error pair” consists of the signal sent, together with the erroneous
response to it. In the above example, the “error pair” is H-S. Since
the signals for 26 letters of the alphabet and the numbers 0-9 are
sent, there are theoretically 1,260 possible substitution errors. Of
course, many of these are highly unlikely to occur.

Data from a total of 807 radio operator students were collected, but
this sample was reduced considerably. First, all code checks which
fell below a level of 80% accuracy were discarded. Checks which fell
below this level of accuracy typically included many omissions and
instances where the student had lost his place in the code series.
The 80% level was chosen in part because of its agreement with the
procedure used by Seashore and Kurtz (3). Second, student records
which included fewer than 1.5 code checks per day were discarded.³
The combination of these two criteria reduced the sample of student
records used in the present analysis to 299.

² The speed per minute was computed on the basis of groups consisting
of five Morse Code characters (letters or numbers) sent in a
randomized order.

³ Students were dropped from the study if, after 15 school days, they
had failed to qualify at the eight groups-per-minute speed.

Table 1

Ranking of Code Errors

[Signal Sent and Response Error code patterns are illegible in the
source; Rank 1 is the pair 6-B.]

         Number of        Proportion of
Rank     Substitutions    Substitutions a
  1         1,396             .095
  2         1,064             .075
  3           859             .059
  4           821             .058
  5           796             .053
  6           753             .050
  7           518             .035
  8           404             .027
  9           382             .025
 10           333             .022
 11           310             .021
 12           297             .020
 13           280             .019
 14           218             .015
 15           217             .015
 16           218             .014
 17           207             .014
 18           192             .014
 19           211             .014
 20           193             .013
 21           189             .013
 22           186             .013
 23           185             .012
 24           169             .011
 25           150             .010
 26            78             .010
 27           147             .010
 28           136             .009
 29           136             .009
 30           135             .009
 31           137             .009
 32           134             .009
 33           132             .009
 34           126             .008

Total      11,709

a Proportions have been rounded [...]. Each stimulus signal except
the signal [...] was sent between 13,000 and 15,000 times. The signal
[...]


Tabulations and Analysis 


The student error records were tabulated in the 
following way: For each of the 299 students in our 
sample, a 36 X 36 table was prepared. The 36 rows 
of this table corresponded to the 36 Morse Code 
stimulus signals sent. The 36 columns of the table 
corresponded to the 36 possible erroneous reception 
responses which a student might make to each 
stimulus signal. Substitution errors were identified 
by collating the student’s code check results with a 




master key. The cells of the table permitted tabula- 
tion of the substitution error frequencies. 

Since the students in the sample had not all been 
exposed to the same code checks, the raw frequencies 
of the stimulus-response errors were converted into 
proportions. Thus, a student’s “score” for a given 
error was the proportion of times he had made that 
error. 
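The tabulation just described amounts to a 36 x 36 confusion table per student, converted to proportions. The sketch below is a minimal illustration: the function name `error_proportions`, the trial records, and the choice of denominator (the number of times the stimulus was sent to that student) are all assumptions, since the text does not state the denominator explicitly.

```python
from collections import Counter
import string

# The 36 characters: 26 letters plus the digits 0-9.
CHARACTERS = list(string.ascii_uppercase + string.digits)

# Every ordered (sent, received) pair with sent != received is a possible
# substitution error: 36 * 35 = 1,260.
possible_errors = len(CHARACTERS) * (len(CHARACTERS) - 1)


def error_proportions(trials):
    """Convert one student's (sent, received) trials into error "scores".

    Each score is the number of times an error pair occurred divided by
    the number of times its stimulus was sent (assumed denominator).
    """
    sent_counts = Counter(sent for sent, _ in trials)
    error_counts = Counter((s, r) for s, r in trials if s != r)
    return {pair: n / sent_counts[pair[0]] for pair, n in error_counts.items()}


# Hypothetical record: H sent four times, copied once as S and once as 5.
trials = [("H", "H"), ("H", "S"), ("H", "H"), ("H", "5"), ("S", "S")]
scores = error_proportions(trials)
print(possible_errors)     # 1260
print(scores[("H", "S")])  # 0.25
```

Working with proportions rather than raw counts keeps students comparable even though they were exposed to different numbers of code checks.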

Table 1 lists, in order, the 34 most frequent sub- 
stitution errors obtained in the present study. Actual 
code signals sent and the code signal perceived in 
each instance are presented along with the propor- 
tion of times each error was made. These 34 pairs 


Table 2

Final Rotated Matrix of Error Factors in Code Reception a

      Stimulus and
      Response            Code Pattern                 Factors
      (letters and
No.   numbers)         Stimulus  Response   I   II  III  IV   V   VI  VII  h²
1 aay) es, a 19 45 07 -04 04 08 25 a 
2 H-S 06 55 —16 03 -04 19 11 38 
3 SAEI kemo —10 50. 08 09 13 2 19 3h 
4 Di Shweta 12 38 05 -12 24 -05 06 z 
5 1-J asia Sl 44 33 -07 27 25 39 86 
6 MG © BNeWe ces os 05 2 17 02 28 o2 02 15 
7 J-i sa o Mbesa 44°04 07 23 10 15 09 2 
8 4V - z =u 3 OF) mw æ i 28 gl 
9 v-4 - = —20 34 06 21 38 02 00 53 
10 S-H 24 34 05 03 09 21 —09 a 
11 J-w =- -- 60 09 O1 03 03 18 16 e 
12 7-Z -- -- 26 43 13 n -07 07 23 A 
A a -=-= -- 43 15 -10 29 10 00 17 = 
E ze she - o8 o 3 © 17 ~02 11 17 
E T - 14 00 31 —02 —01 20 —11 19 
i? = Gees ag Sy 144 02 25: -0 19 26 07 2 
A AA O 09 23 -01 30 08 21 —02 25 
5 ae W aa io AG 36 15 is) ath 02 28 
a Bee a ig 45 02 03. 48 dy —o7 B 17 
F TE Wa. 25 23 A7 -06 03 —13 —09 25 
a e a me 23 OS «= 41-01 -13 01 0 33 
= Ta 1% 14 is gs ibi is =i% 21 
A D f x oo 12 12 #414 2% 06 B 2 
ae ao e yliees > 21° 35 OS it -06 . 15 —06 13 
a <<. eo es i 19 -06 -17 2 o —09 10 
e7 TA J os 10 a7 22 08 o1 00 28 
A a e a es" 10 DL dt ys) as 17 
a: ei a 7 23 16 -04 04 is 21 32 
30 28 fe ze 11 33 22 ds -03 ~29 29 21 
A a $ H 42 21 14 -05 1s 9 —07 21 
= at 17 -03 40 08 06 -09 06 25 
A on 22 40! 06 ~03 =G6 — 17 —09 49 
34 V-H : 05) 22 05 is -3 P 31 
02 04 50 06 0g 24 =0F i 
* Rounded from three places and decimals omitted. 




represent about half the total substitution errors 
made out of the 1,260 possible pairs. 

The 299 “scores” for each of these 34 error pairs 
were correlated with the scores for each of the other 
error pairs. The resulting correlation matrix was 
factor analyzed using the Thurstone Centroid Method 
(5). The rotations necessary to achieve orthogonal 
simple structure were accomplished using Zimmer- 
man’s procedure (7). 
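To make the analysis steps concrete, the sketch below correlates error "scores" across students and extracts the first centroid factor only. It is a simplified illustration on synthetic data, not the authors' computation: it assumes the common convention of placing the largest off-diagonal correlation on each diagonal as a communality estimate, and it omits sign reflection, later factors, and rotation.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlate every variable with every other variable."""
    k = len(columns)
    r = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    # Assumed communality estimate: largest off-diagonal value in each row.
    for i in range(k):
        r[i][i] = max(abs(r[i][j]) for j in range(k) if j != i)
    return r


def first_centroid_loadings(r):
    """First centroid factor: column sums over the root of the grand total."""
    total = sum(sum(row) for row in r)
    col_sums = [sum(row[j] for row in r) for j in range(len(r))]
    return [c / math.sqrt(total) for c in col_sums]


# Synthetic stand-in for the data: 299 students, 4 error variables that
# share one common source of difficulty plus independent noise.
random.seed(0)
common = [random.gauss(0, 1) for _ in range(299)]
columns = [[c + random.gauss(0, 1) for c in common] for _ in range(4)]

r = correlation_matrix(columns)
loadings = first_centroid_loadings(r)
print([round(a, 2) for a in loadings])
print(34 * 33 // 2)  # distinct correlations among 34 error pairs: 561
```

In the full procedure the first factor's contribution would be removed from the matrix and the extraction repeated on the residuals before rotating the resulting axes.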


Results 


Order of Difficulty of Error Pairs 


The extent to which the order of difficulty in Table 1 agrees with
the results obtained by Seashore and Kurtz (3)⁴ is indicated by the
fact that 23 of the 34 variables used in the present study were among
the 34 most frequent substitution errors identified by Seashore and
Kurtz.

Table 1 indicates that the most difficult error pair was 6-B (that
is, -.... was perceived as -...). This error was made an average of
9.5 times for each 100 times that the stimulus signal (-....) was
transmitted. It should be noted that the total number of errors
committed for a given stimulus signal was considerably larger than is
indicated in Table 1. These figures only represent how often a
particular substitution error was made; they do not include other
substitution errors for the same stimulus signal (beyond the 34th
most frequent) and they do not include omissions.


Interpretation of Factors 
The final rotated matrix is presented in Table 2.⁵ Factor loadings of
.30 and above were considered significant in defining a factor.

⁴ Since Seashore and Kurtz presented data for only the three most
frequently occurring substitution errors for each code character, the
comparison is suggestive rather than exact.

⁵ Two tables showing the correlation matrix and the centroid matrix
have been deposited with the American Documentation Institute [...]
ADI Auxiliary Publications Project, Photoduplication Service, Library
of Congress [...] Variable 5 from the factor interpretations, except
in the case of the factor on which it was most heavily loaded.

Factor I, Dash Estimation. This factor is 
confined to those signals which contain a 
number of dashes. In all of these signals, 
the dashes occur in a series either at the be- 
ginning or end of the signal or comprise the 
entire signal. In no case is there a dot inter- 
spersed within the series of dashes, either in 
the way the signal is sent or perceived. The 
error is always in estimating the correct num- 
ber of dashes in the signal. This error may 
be one of omission or addition; that is, the S 
may perceive one too many or one too few 
dashes, but he never “shortens” a dash to a 
dot or “lengthens” a dot to a dash. The 
number of dots in these signals is always per- 
ceived correctly. This factor is extremely 
well defined, as loadings drop off sharply 
after .42. Of the 34 error pairs included in 
the analysis, the six variables on this factor 
represent, without exception, the only errors 
of this type in the analysis. 


Frequency   Error        Code Pattern            Factor
Rank        Pair     Stimulus    Response       Loading
   11       J-W       .---        .--             .60
    5       1-J       .----       .---            .51
   19       0-M       -----       --              .45
    7       J-1       .---        .----           .44
   13       8-Z       ---..       --..            .43
   30       Z-8       --..        ---..           .42


Factor II, Dot Estimation. This factor involves signals consisting
mainly of dots. The dots always come in a row either at the beginning
or end of the signal or else the signal consists only of dots. No
dash ever separates the dot sequences either in the signal actually
sent or in the signal as perceived. This appears to be the dot
counterpart of Factor I. In this factor the error is always in
estimating the correct number of dots in a series. No errors are made
in estimating the number of dashes. The factor includes both
overestimates and underestimates of the number of dots in a series;
however, it appears most strongly in errors of underestimation with
signals containing a long series of dots. Apparently the factor does
not extend to errors in which only two dots are sent and perceived as
one, as in Variable 33 (.. sent, . received) or Variable 28 (--..
sent, --. received). This factor is general to more variables than
any other and contributes to more of the most frequently occurring
errors. Thus, the four most frequent errors in the sample are
saturated with this factor.


Frequency   Error        Code Pattern            Factor
Rank        Pair     Stimulus    Response       Loading
    2       H-S       ....        ...             .55
    3       5-H       .....       ....            .50
    1       6-B       -....       -...            .44
   12       7-Z       --...       --..            .43
   32       S-I       ...         ..              .40
    4       H-5       ....        .....           .38
    8       4-V       ....-       ...-            .37
   24       B-D       -...        -..             .35
    9       V-4       ...-        ....-           .34
   10       S-H       ...         ....            .34
   29       Z-7       --..        --...           .33


Factor III, End-element Substitution. The stimulus characters loaded
on this factor are of varied types (i.e., predominantly dots,
predominantly dashes and mixed elements), but the type of error made
is completely consistent from character to character. In each case,
an error of substitution is made and this error always occurs on the
last element of the character sent. The only other variable
representing an error of this type is Variable 16 (....- sent, .....
received). Variable 16 has a loading on this factor of .25, next
highest to the variables listed. Among the 34 variables in the
analysis, none but these involves final element substitution. The
substitution may be either a dot for a dash or a dash for a dot. The
trainee, in these instances, always perceived the correct number of
elements per character.


Frequency   Error        Code Pattern            Factor
Rank        Pair     Stimulus    Response       Loading
   34       V-H       ...-        ....            .50
   21       R-[?]     [illegible] [illegible]     .41
   31       B-X       -...        -..-            .40
   18       Y-C       -.--        -.-.            .36
   15       H-V       ....        ...-            .31
   14       G-Y       [illegible] [illegible]     .31




Factor IV, Internal Error. This factor extends to fewer variables
than was the case with the previous factors, and interpretation,
therefore, is not as secure. All the characters sent consist of both
dots and dashes. These occur as two series, dots followed by dashes
or dashes followed by dots. The characters do not involve changes
from dots to dashes and back to dots again or changes from dashes to
dots and back to dashes again. The distinguishing feature of three
out of four of these variables is that an internal substitution error
is made. These three variables involve the sending of five-element
characters; the substitution error occurs precisely in the middle
element. Further, the error occurs at the end of the initial dot or
dash series within the signal. It is to be noted, however, that this
is the only type of internal substitution error made among the 34
variables in the analysis. It is not known, for example, if this
factor would extend to a signal in which ..-. is sent and .--. is
received.

Frequency   Error        Code Pattern            Factor
Rank        Pair     Stimulus    Response       Loading
   22       [illegible]                           .45
   23       V-U       ...-        ..-             .41
   27       [illegible]                           .41
   17       3-2       ...--       ..---           .30


Factor V, A Doublet. This is a doublet factor consisting of one
variable (V-4) in which ...- was sent and ....- was received and one
variable (4-V) in which ....- was sent and ...- was received. It can
be seen that the second of these variables is the inversion of the
first. Apparently this factor represents nothing more than some
specific source of difficulty restricted to these two signals. It is
surprising that more doublets of this type (that is, those confined
to reversals) did not appear in the analysis since 11 of the 34
substitution errors studied are reversals of other errors among these
34. Instead, both variables constituting an inverted pair usually
received their highest loadings on the same factor, along with other
error pairs. Apparently, the major source of difficulty underlying
such inversion errors is general to other kinds of errors.

Factors VI and VII. These are considered 
to be residual factors with no apparent psy- 
chological meaning. 
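Because the four interpreted factors are described in terms of the physical form of the error, their definitions can be turned into a rough rule-based labeler. The sketch below is only a heuristic reading of the descriptions above: the function names and the exact rules (for example, requiring dots and dashes each to form one unbroken series) are assumptions made here, and the labels are not derived from factor loadings.

```python
def single_run(signal: str, element: str) -> bool:
    """True if every `element` in `signal` sits in one unbroken series."""
    n = signal.count(element)
    return n == 0 or element * n in signal


def classify_error(sent: str, received: str) -> str:
    """Label a substitution error by its physical form (heuristic only)."""
    if len(sent) == len(received):
        diffs = [i for i, (a, b) in enumerate(zip(sent, received)) if a != b]
        if diffs == [len(sent) - 1]:
            return "End-element Substitution"
        if len(diffs) == 1 and 0 < diffs[0] < len(sent) - 1:
            return "Internal Error"
        return "Other"
    dots = (sent.count("."), received.count("."))
    dashes = (sent.count("-"), received.count("-"))
    runs_ok = all(single_run(s, e) for s in (sent, received) for e in ".-")
    if runs_ok and dots[0] == dots[1] and dashes[0] != dashes[1]:
        return "Dash Estimation"
    if runs_ok and dashes[0] == dashes[1] and dots[0] != dots[1]:
        return "Dot Estimation"
    return "Other"


# International Morse patterns for a few of the error pairs discussed.
examples = {
    "J-W": (".---", ".--"),
    "H-S": ("....", "..."),
    "6-B": ("-....", "-..."),
    "V-H": ("...-", "...."),
    "3-2": ("...--", "..---"),
}
labels = {pair: classify_error(*codes) for pair, codes in examples.items()}
for pair, label in labels.items():
    print(pair, "->", label)
```

A labeler of this kind could be used to sort error pairs that were too infrequent to enter the factor analysis, with the caveat that such assignments are judgments about form and not empirical factor memberships.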


Discussion 


It should be pointed out that the four fac- 
tors identified are not to be considered the 
only factors in code error patterns. The pres- 
ent study was confined to the 34 most fre- 
quently made errors. For example, it is pos- 
sible that initial-element substitution errors 
may cluster together; this would be the coun- 
terpart to our end-element substitution. How- 
ever, initial-element substitutions did not oc- 
cur with sufficient frequency to be included 
in the present study. It is reasonable to as- 
sume that the four factors in the present 
study are of the most practical importance 
since they were identified from the most fre- 
quently made errors. 

Seashore and Kurtz (3) list the number of times that the three most
frequent substitution errors for each of the 36 Morse Code characters
were made during the second week of code training. Since this
provides frequency information not available elsewhere on 108 common
substitution errors, an attempt has been made to classify these
errors in terms of the four factors found in the present study. The
classification was accomplished subjectively or on the basis of a
factor loading from the present study where available. The results of
this classification are presented in Table 3. It can be seen that not
all of the 108 substitution errors were readily identifiable with our
four factors. For this reason, a category of “All Other Errors” has
been included.

When classified in this way, it is clear that the factors of Dot
Estimation and End-element Substitution account for the largest
percentage of errors in the Seashore-Kurtz data as well as in our
own. Although these two categories contribute less than half of the
108 stimulus-response pairs included in the Seashore-Kurtz list, they
account for two-thirds (66.3%) of the total number of substitution
errors made; thus, of the 9,502 errors made for the 108 S-R pairs,
66.3% of these were errors of Dot Estimation or End-element
Substitution. The remaining two categories derived from our factor
analysis, Internal Error and Dash Estimation, contribute a combined
total of 19.5% of the total errors. Thus, our four factors can be
used to account for 85.8% of the errors in the Seashore-Kurtz list.⁶

Table 3

Frequency of Several Kinds of Substitution Errors a

                                  Number of      Frequency    Per Cent
                                  Erroneous      of           of Total
Category                          S-R Pairs      Errors       Frequency
Factor I
  Dash Estimation                    11             801          8.4
Factor II
  Dot Estimation                     18           3,435         36.2
Factor III
  End-element Substitution Error     26           2,858         30.1
Factor IV
  Internal Error                     14           1,057         11.1
All Other Errors                     39           1,351         14.2
Total                               108           9,502        100.0

a The 108 substitution errors identified by Seashore and Kurtz (3)
classified according to factors identified in the present study.
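The percentages quoted in this passage can be recomputed from the pair counts and error frequencies in Table 3. The sketch below follows the text in summing the rounded category percentages; the dictionary layout and variable names are illustrative only.

```python
# (number of S-R pairs, frequency of errors) per category, from Table 3.
table_3 = {
    "Dash Estimation": (11, 801),
    "Dot Estimation": (18, 3435),
    "End-element Substitution": (26, 2858),
    "Internal Error": (14, 1057),
    "All Other Errors": (39, 1351),
}

total_pairs = sum(pairs for pairs, _ in table_3.values())
total_errors = sum(freq for _, freq in table_3.values())

# Per cent of total frequency, rounded to one decimal as in the table.
percent = {
    name: round(100 * freq / total_errors, 1)
    for name, (_, freq) in table_3.items()
}

dot_and_end = round(
    percent["Dot Estimation"] + percent["End-element Substitution"], 1
)
dash_and_internal = round(
    percent["Dash Estimation"] + percent["Internal Error"], 1
)
four_factors = round(100 - percent["All Other Errors"], 1)

print(total_pairs, total_errors)
print(percent)
print(dot_and_end, dash_and_internal, four_factors)
```

The two dominant categories together cover 44 of the 108 pairs, which is why the text can say they contribute less than half of the pairs while accounting for roughly two-thirds of the errors.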

It is of special interest to note that the factors identified do not
agree with the error categories arrived at subjectively by other
investigators (see above). However, as we have shown, these factors
derived empirically from the error intercorrelations do simplify the
description of the same phenomena in terms of fewer orthogonal
categories. A good example is the prior categorization of errors into
those of “shortening” the stimulus character (e.g., .... sent and ...
perceived), and those of “lengthening” (e.g., ... sent, ....
perceived). Our results show that errors of lengthening and errors of
shortening may fall within either the Dot Estimation or Dash
Estimation factors. The critical feature contributing to difficulty
is not “lengthening” or “shortening” but rather incorrect perception
of the number of dots or dashes in series.

⁶ If we classify subjectively the remaining 14.2% of the errors, we
find that 7.2% of these represent a dot-dash inversion (e.g., [...]
sent; [...] received) and 2.9% initial-element substitutions. Whether
these would eventually turn out to be empirical factor categories is
not known.

This is related to the finding, pointed out earlier, that inverted
error pairs appear within the same factor. In this regard, both our
findings and those of Seashore and Kurtz indicate that inversion
errors of “shortening” occur more frequently than errors of
“lengthening” involving the same pair; for example, .---- sent and
.--- received occurs more frequently than .--- sent and .----
received. This kind of result, taken together with the finding that
the same factor contributes to both errors, indicates that responses
of “lengthening” and “shortening” do not differ in kind, but do
differ in their probability of occurrence. The response tendency for
“shortening” is ordinarily greater than the response tendency for
“lengthening.”

It should be stressed that we have made 
no attempt to relate our “factors” to under- 
lying perceptual processes. These factor cate- 
gories represent descriptive labels based on 
the physical descriptions of the errors made. 
Relationships to more fundamental processes 
may be of theoretical interest and worthy of 
further investigation. A step in this direction 
has been taken in a recent study by Fleish- 
man, Roberts and Friedman (2). 


Implications for Training Methods 


In Air Force training, radio operator students are presented each of
the 36 Morse Code characters with approximately equal frequency.
Seashore and Kurtz (3) recommended that training emphasize those
characters most frequently confused with each other.⁷ The present
results suggest the additional step of diagnosing students’
difficulties in terms of the four underlying “sources” identified and
providing remedial training using materials which emphasize these.
Since the factors are orthogonal, diagnosis for individual students
may show difficulties characteristic of only one factor, or of
combinations of factors. It is possible that one source of difficulty
may yield to remedial training while another may not. The present
categorization provides a basis for experimental research on this
problem.

⁷ At least one experiment (4) has shown no advantage of this approach
in paired associate (Code signal-letter) code learning.


Implications for Selection 


In the Air Force, and in other military 
branches, students are selected for training on 
the basis of testing procedures which include 
at least one aural code aptitude test. The 
most valid single code test has been found to 
be one which requires the learning of actual 
Morse Code signals, although over-all predic- 
tion can be increased by adding to this test 
other aural tests not involving actual Code 
(1). It is possible that the validity of code 
aptitude tests can be increased still further if 
such a test were constructed around the four 
factors identified in the present study. At 
least it would seem reasonable to include a
representation of items sampling these factors.
Our findings also present an interesting
hypothesis worthy of experimental test.
Since two of the factors, Dot Estimation and 
End-element Substitution, contribute most of 
the errors that students made in training, it 
is possible that a code aptitude test empha- 
sizing these factors might prove particularly 
predictive of especially high levels of code 
proficiency. In other terms, such measures 
may be more predictive of final asymptotes 
in code learning than present measures. 


Implications for Proficiency Measurement 


Implications here are analogous to those presented above. A current
procedure for evaluating a trainee’s level is the “code check” during
which the trainee receives a random series of signal “groups” (see
above) at a given code speed. It is likely that these are quite
comparable from student to student, or class to class, etc. However,
it is possible that consideration might be given to reapportioning
the relative frequency of signals sent within the above factor
categories.


Summary 


A factor analysis of correlations among the most frequent
substitution errors in receiving International Morse Code was
performed. Four principal factors were identified: Dash Estimation,
Dot Estimation, End-element Substitution, and Internal Error. These
results differ somewhat from previous categories arrived at
subjectively. The Dot Estimation factor was general to more different
types of substitution errors and contributed to the most frequently
occurring errors.


Received June 24, 1957. 


References 


1. Fleishman, E. A. Predicting code proficiency of 
radio telegraphers by means of aural tests. 
J. appl. Psychol., 1955, 39, 150-155. 

2. Fleishman, E. A., Roberts, M. M. & Friedman, 
M. P. A factor analysis of aptitude and pro- 
ficiency measures in radio-telegraphy. J. appl. 
Psychol., 1958, 42, 129-135. 


119 


3. Seashore, H. G., & Kurtz, A. K. Analysis of er- 
rors in copying code. Selection and training 
of radio code operators—report No. 8. The 
Psychological Corporation, Aug. 12, 1944. 

4. Sidman, M., Keller, F. S., Kennedy, E. J., & Wil-
son, M. P. Teaching Morse Code reception
with signals weighted in frequency according
to their difficulty. J. appl. Psychol., 1955,
39, 1-4.

5. Thurstone, L. L. Multiple factor analysis. Chi-
cago: Univer. of Chicago Press, 1947.

6. War Department. War Department technical
manual TM 11-459 International Morse Code
(Instructions). Washington: Author, 1945.

7. Zimmerman, W. S. A simple graphical method
for orthogonal rotation of axes. Psycho-
metrika, 1946, 11, 51-55.


Journal of Applied Psychology 
Vol. 42, No. 2, 1958 


A Change in a Product Image 


William D. Wells, Fedele J. Goi, and Stuart Seader 


Rutgers University, Newark Colleges 


Advertising campaigns are often designed 
to make a specific change in the reputation 
of a product or a brand. Tea advertising, for 
example, is currently stressing the idea that 
tea is “brisk,” “robust,” and “hearty,” in an 
effort to counteract the impression that tea is 
a namby-pamby drink. Parliament Cigarette 
advertising is another example. For years, 
Parliaments were characterized by a premium 
price, an odd package, and advertising which 
leaned heavily on snobbery. Now, Parlia- 
ments have a “new low price,” a standard 
package, and advertising designed to show 
that Parliaments may be smoked safely on 
either side of the tracks. 

The objective of this kind of advertising 
is to change the product “image” or product 
“personality” in such a way that the product 
itself will have wider appeal. It is an impor- 
tant objective because product images are 
known to have a strong influence on sales. 

An earlier report (2) presented an Adjec- 
tive Check List designed for survey use in 
the study of product images. It also pre- 
sented, as specimen results, the images asso- 
ciated with Cadillac, Buick, Chevrolet, Ford, 
and Plymouth automobiles among a group 
of male college students. The earlier report 
was based on data gathered during October, 
1956, just before the introduction of the 1957 
models. The present report is based on data 
gathered from approximately the same group 
of respondents six months later. The con- 
trast between the two sets of results provides 
an illustration of the impact of the new mod- 
els and their attendant publicity upon the 
1956 brand images. 

The earlier report contained a detailed dis- 
cussion of the Adjective Check List itself, and 
of the rationale behind the forced choice de- 
sign of the questionnaire. Therefore, pro- 
cedure for the present report will be sketched 
only briefly here. 


Procedure 


The Adjective Check List used in both studies con- 
tained 108 trait names, selected for frequency of use 


and appropriateness to the problem of measuring 
product images. Respondents were asked to indicate 
first whether each trait was most typical of Cadillac 
Owners, Buick Owners, or Chevrolet Owners; then 
whether each trait was most typical of Chevrolet 
Owners, Ford Owners, or Plymouth Owners. Trait 
names and car names were systematically rearranged 
to avoid position biases.

The respondents in the 1956 survey were given no
special instructions about model year, so it is not
possible to be absolutely sure of the frame of refer-
ence they used in answering. However, it seems
fairly safe to assume that most of them answered in
terms of the general run of cars then on the road.
In the 1957 survey, the frame of reference was speci-
fied. Each questionnaire was plainly marked “1957
models.” In both 1956 and 1957, respondents were
100 fraternity members from Rutgers University,
Newark Colleges. The overlap between the two
groups was about 70%.
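
The forced-choice format described above gives each owner type a one-in-three chance of being picked for any trait if respondents answered at random, and the Results section reports only traits whose frequencies exceed chance at the .01 level. The source does not say which significance test was used, so the sketch below is an assumption: it treats the 100 respondents as independent and applies a one-sided exact binomial test. The function names are hypothetical.

```python
from math import comb


def upper_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_significant_count(n, p, alpha):
    """Smallest count k whose upper-tail probability is at most alpha."""
    k = 0
    # The tail probability shrinks as k grows and is 0 at k = n + 1,
    # so the loop always terminates.
    while upper_tail(n, k, p) > alpha:
        k += 1
    return k


# Assumed setup: 100 respondents, three owner types per comparison.
threshold = min_significant_count(n=100, p=1 / 3, alpha=0.01)
print("Mentions needed to exceed chance at the .01 level:", threshold)
```

Under these assumptions, a trait would need roughly the printed number of mentions out of 100 before it could be listed as part of an owner image.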


Results and Discussion 


It is obvious that results obtained from 100
college students can not be thought of as
characteristic of the consumer population.
However, the changes which occurred within
this limited population provide a “test tube”
demonstration that the introduction of a new
model car, reinforced by a heavy advertising
investment, can produce a marked attitude
change in a relatively short time.

Space limitations prevent reporting
complete tables of results,* but lists of the
traits most associated with the various car
owners will show the nature of the brand im-
ages. In the lists which follow, trait names
are ordered by frequency of mention. All
frequencies exceed chance expectations at the
.01 level or beyond.

*A complete tabulation of the responses to each
adjective has been deposited with the American
Documentation Institute. Order Document No. [illegible]
from the ADI Auxiliary Publications Project, Photo-
duplication Service, Library of Congress, Washing-
ton 25, D. C., remitting in advance $1.25 for micro-
film or $1.25 for photocopies. Make checks payable
to Chief, Photoduplication Service, Library of Con-
gress. A mimeographed copy of the table may be
obtained free of charge by writing to William D.
Wells, Rutgers University, Newark Colleges, Newark
2, New Jersey.

Cadillac-Buick-Chevrolet. The 1956 data
showed the Cadillac-Buick-Chevrolet compari-
son to be clearly stratified along class lines.
The Cadillac Owner was called rich, high- 
class, famous, important, fancy, proud, su- 
perior, and successful. The Buick Owner was 
called middle-class, brave, masculine, strong, 
modern, and pleasant. And the Chevrolet 
Owner was called poor, low-class, ordinary, 
plain, simple, practical, common, average, 
cheap, thin, little, friendly, and small. With 
a variable as powerful as price differential 
operating, and with no major changes in 
either the automobiles or their advertising, it 
is not surprising to find that the 1957 data 
show exactly the same trend. In 1957, The 
Cadillac Owner was still called rich, high- 
class, famous, important, etc.; The Buick 
Owner was called middle-class, masculine, 
rough, calm, and independent; and The 
Chevrolet Owner was called plain, average, 
poor, simple, little, practical, low-class, etc.
A few of the differences between the 1956 
data and the 1957 data were statistically sig- 
nificant, but the differences were minor in 
size and import compared with the differences 
which occurred within the “low priced three.” 
Ford-Plymouth-Chevrolet. In the 1956
data, The Ford Owner appeared youthful and
dashing. The traits most often ascribed to
him were: masculine, young, powerful, good-
looking, rough, dangerous, strong, single,
merry, loud, active, cool, tall, interesting,
sharp, and popular. In 1957, the traits most
often ascribed to The Ford Owner were: dan-
gerous, loud, rough, powerful, cross, thin, ac-
tive, proud, and brave. This image seems
rougher, tougher, and less debonair.

In 1956, the image of The Plymouth
Owner was not unfavorable, but it was defi-
nitely stodgy. The Plymouth Owner was
called quiet, careful, slow, silent, moral, fat,
gentle, calm, sad, thinking, patient, honest,
understanding, and content. Against this im-
age, the Chrysler Corporation laid a radically
changed automobile and high-style advertis-
ing campaign which announced that the new
Plymouth was not one, but three full years
ahead. “In one flaming moment,” introduc-
tory copy said, “Plymouth leaps three full
years ahead—the only car that dares to break
the time barrier! Plymouth’s traditionally
great engineering brings you the fabulous 
new Fury ‘301’ V-8 engine . . . revolution- 
ary new Torsion-Aire ride . . . exhilerating 
sports car handling ... dramatic Flight- 
Sweep Styling. The car you might have ex- 
pected in 1960 is at your dealer’s now!” (1, 
pp. 18-19). It is hard to believe that the 
quiet, careful, slow owner of an old Plymouth 
would dare go near a car with a Fury engine. 

The net effect of the new model and the 
high-style advertising was to shatter the old 
Plymouth image completely—at least as far 
as the present respondents were concerned. 
Of the 14 significant adjectives in the old 
image, not one remained in 1957. The 1957
Plymouth image consisted of six words: high-
class, feminine, important, rich, different, and
particular. 

In the 1956 data, the Chevrolet portion of 
the Ford-Plymouth-Chevrolet comparison was 
somewhat nondescript. Ordinary, fair, and 
common were the only adjectives significantly 
above chance at the .01 level. In 1957, the 
product image was still something less than 
exciting. The Chevrolet Owner was called 
small, low-class, little, simple, ordinary, and 
practical. It is worth noting that 1957 
Chevrolet advertising persistently billed the 
new Chevrolet as “sweet, smooth, and sassy.” 
Without the backing of a dramatically 
changed automobile, this advertising seems to 
have made little impression upon the mun- 
dane practicality of the good old Chevrolet. 


Summary 


A previous report described the personality 
stereotypes associated with five well-known 
automobiles by a group of college student 
respondents. The present report shows the 
changes in these stereotypes which resulted 
from the introduction and promotion of the 


1957 models.


Received June 28, 1957.


References


1. Chevrolet advertisement in Life, 1956, 41, No. 19,
18-19.

2. Wells, W. D., Andriuli, F. J., Goi, F. J., &
Seader, S. An adjective check list for the
study of “product personality.” J. appl. Psy-
chol., 1957, 41, 317-319.


Journal of Applied Psychology 
Vol. 42, No. 2, 1958 


A Study of Occupational Stereotypes ¹


K. F. Walker 


University of Western Australia 


Although the concept of stereotype has 
been fruitfully applied in social psychology, 
particularly in studies of public opinion and 
political and international attitudes, occupa- 
tional stereotypes have received little atten- 
tion. So far their empirical investigation has 
been confined to a few studies of stereotypes 
relevant to industrial relations. 

Haire and Grunes (4) adopted Asch’s tech- 
nique (1) of studying the way in which the 
perception of a single personality variable in- 
fluences the perception of other aspects of the 
personality, and found that perception of a 
person in the role of a factory worker modi- 
fied various other aspects of the observer’s 
view of the person. Haire (3) asked labor 
union and managerial personnel to check the 
adjectives they considered applicable to men 
shown in photographs with accompanying 
personality descriptions. He found marked 
differences in their responses according to 
whether the men were labeled as union offi- 
cials or plant managers. In another investi- 
gation reported in the same paper he found 
characteristic differences in the types of 
words and phrases used by labor and man- 
agement representatives in collective bargain- 
ing conferences. An earlier unpublished in- 
vestigation by Haire and Morrison (5) 
showed differences in the perception of labor 
and management personnel by children of 
higher and lower income groups. 

Stagner (8) had college students check 
adjectives they considered applicable to the 
average business executive and the average 
worker. They were also asked to check the 
adjectives they considered “good” and those 
they thought applied to themselves. Although 
many traits were ascribed to executives and 
workers in approximately equal frequency, 
the students saw definite differences between 
the two groups on a number of traits (rank 


1 This study was supported by a grant from the 
Carnegie Corporation of New York which, however, 
is in no way responsible for any statement made. 


difference correlation + .44). Pro-labor and
anti-labor students differed in their percep- 
tions of the traits of both groups, the differ- 
ence being greater in the traits ascribed to 
executives. Pro-labor students attributed to 
themselves most of the traits they ascribed to 
workers, and regarded these traits as good, 
but they rejected the traits they ascribed to 
executives and did not regard these traits as 
good. Anti-labor students ascribed their own 
traits to executives much more than to work- 
ers. There was some evidence that the stereo-
type of the executive was stronger than that
of the worker, especially in the anti-labor 
students. 


Method 


The aim of the present study was to see whether
the technique used by Katz and Braly (6) and Gil-
bert (2) to investigate ethnic stereotypes would yield
comparable occupational stereotypes that would be
relevant to industrial relations. The technique of
Katz and Braly required subjects to choose from a
list of 84 adjectives the five which in their opinion
best described members of a particular ethnic group.
In the present investigation, 10 occupational groups
were substituted for ethnic groups and the list of
adjectives was somewhat altered and extended. The
sample group consisted of 68 male and 56 female
university students enrolled for an introductory
course in Psychology. The age range was 17
to 46 years in the male group, with a median of
20.4 years and a mode of 18 years; in the female
group the age range was 17 to 30 years, with a
median of 17.5 years and a mode of 18 years. The
students were told that the project was a research
project and completed the schedules in class time. In
addition to naming the adjectives, they were asked to
rate their political sympathies on a five-point scale
running from strongly pro-Labour to strongly pro-
Liberal or Country Party (Labour’s political op-
ponents). They were also asked to rank the 10
occupational groups in order of their preference if
circumstances permitted them to choose any of
them for their life work.

The strength of stereotypes was measured by the
index used by Katz and Braly, which is given by
the number of adjectives sufficient to account for
half the total votes cast. Thus if N = 20, and all
subjects list the same five adjectives, each of these
would account for 20 votes. The total number of
votes is N × 5 = 100, and 2.5 adjectives would ac-
count for 50% of the total votes. Complete agree-
ment among voters gives a minimum value of 2.5
for the stereotypy index, whatever the number of
subjects or adjectives, but the theoretical maximum
for the index equals half the number of adjectives
in the list. Thus in the example above, if there were
10 adjectives to choose from and they were chosen
completely at random, the chance expectation would
be that each adjective would be chosen (20 × 5)/10
= 10 times. Five adjectives would then be required
to account for half the total votes. As Katz and
Braly used a list of 84 adjectives the stereotypy
index in their investigation could have gone as high
as 42. In the present investigation 112 adjectives
were used, and the theoretical maximum for the
stereotypy index was 56.
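
The index described above can be sketched in a few lines. The function name is hypothetical, and the linear interpolation within the last adjective counted is an assumption; the source implies it by quoting a fractional value of 2.5 but does not spell it out.

```python
def stereotypy_index(vote_counts):
    """Number of adjectives (most chosen first) needed to reach half of all votes.

    vote_counts: votes received by each adjective for one occupation.
    A fraction of the last adjective is counted when only part of its
    votes are needed (assumed interpolation rule).
    """
    counts = sorted((c for c in vote_counts if c > 0), reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no votes cast")
    target = total / 2
    running = 0
    for position, count in enumerate(counts):
        if running + count >= target:
            return position + (target - running) / count
        running += count


# Worked examples from the text, with N = 20 subjects choosing five adjectives each.
complete_agreement = [20] * 5      # everyone lists the same five adjectives
random_choice = [10] * 10          # 100 votes spread evenly over 10 adjectives
print(stereotypy_index(complete_agreement))  # 2.5
print(stereotypy_index(random_choice))       # 5.0
```

Lower values therefore mean stronger agreement among respondents, which is why the index reads as a measure of how definite a stereotype is.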


Results 


Table 1 shows stereotypy indices for the 
ten occupations, and the rank order of pref- 
erence in which the occupations were placed. 
There were no significant differences between 
the sexes, nor between sympathizers with dif- 
ferent political parties. Students expecting to 
enter a particular occupation did not give 
Significantly different responses from those 
expecting to enter other occupations. The 
rank difference correlation between the de-
gree of stereotypy and order of preference
for the occupations was + 0.79. Table 1
also shows the main content of the stereo-
types, in which many differences are evident.
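
The rank difference correlation quoted above can be recomputed from the stereotypy indices printed in Table 1, which lists the occupations in order of preference. The sketch below uses the standard rank difference formula, 1 - 6Σd²/(n(n² - 1)), and assumes no tied values (true of these ten indices). The function names are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_difference_correlation(x, y):
    """Rank difference correlation for two equally long, tie-free lists."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


# Table 1: occupations in order of preference (School Teacher first,
# Coal Miner last) with their stereotypy indices.
preference_rank = list(range(1, 11))
indices = [11.7, 7.9, 11.8, 10.2, 8.5, 13.4, 12.6, 15.5, 14.9, 13.5]

rho = rank_difference_correlation(preference_rank, indices)
print(round(rho, 2))  # 0.79
```

The positive sign means that the less preferred occupations have the higher indices, that is, the less definite stereotypes.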

Comparing these results with those ob- 
tained in studies of ethnic stereotypes among 
Students, it would appear that occupational 
Stereotypes are approximately as strong as 
ethnic stereotypes. Katz and Braly (6), 
working with Princeton undergraduates in
1933, obtained an average stereotypy index
of 8.5 for ten ethnic groups. The replication
of their study by Gilbert (2) in 1950 ob-
tained a much higher average index of 15.3.
In unpublished studies directed by Taft at
the University of Western Australia in 1953
and 1954, students gave stereotypy indices
of 12.3 and 12.6.

Katz and Braly found no relation between
respondents’ preference for an ethnic group
and the strength of the stereotype for that
group. In the studies at the University of
Western Australia rank difference correlations
of + .79 and + .78 were found, which corre-
spond exactly with that found between the
strength of occupational stereotypes and or-
der of preference in the present study.


Table 1
Stereotypes of Ten Occupations
(N = 124)

                                                               Stereotypy
Occupation            Stereotype                                 Index

School Teacher        well educated, intelligent,                 11.7
                      tolerant, fairminded, friendly
Doctor                intelligent, efficient, well                 7.9
                      educated, humanitarian, practical
Lawyer                alert, calculating, well                    11.8
                      educated, shrewd, clever
Arbitration           fairminded, intelligent,                    10.2
Court Judge           open-minded, practical, honest
Factory Owner         ambitious, industrious,                      8.5
                      practical, efficient, progressive
Politician            ambitious, argumentative,                   13.4
                      power-seeking, talkative, evasive
Factory Foreman       efficient, industrious,                     12.6
                      honest, practical, methodical
Trade Union           aggressive, determined,                     15.5
Leader                ambitious, argumentative,
                      power-seeking
Factory Worker        friendly, co-operative,                     14.9
                      honest, imitative, efficient
Coal Miner            rough, tough, friendly,                     13.5
                      honest, industrious

Note. Occupations have been listed in order of preference.
Only the first five adjectives, in order of frequency of listing,
are shown. The stereotypy index shows the number of adjec-
tives that account for 50% of the votes cast for each occupation.


Discussion 


Admittedly, the Katz and Braly method 
has short-comings, not the least of which is 
the semantics of the large number of adjec- 
tives in the list. The fact remains, however, 
that whatever degree of validity is attributed 
to the ethnic stereotypes obtained by the 
technique must likewise be attributed to oc-
cupational stereotypes. The existence of such
stereotypes seems to accord equally well with
commonsense observation, although we have 
no information on their existence in other 
groups. 

As with ethnic stereotypes, much research 
will be required before we are in a position 
to check the truth or falsity of occupational 
stereotypes. The information available on 
“occupational personalities” is limited in ex- 
tent and heterogeneous in nature and in the 
methods by which it has been obtained (7). 
It is not in a form suitable for comparison 
with stereotypes of the sort found in this 
and previous investigations. On the whole, 
vocational counselors have focused their at- 
tention on abilities and interests. 

The present study throws no light on the 
factors responsible for the stereotypes. In 
particular, there is no indication of selective
perceptual distortion arising from political 
group membership, although it is possible 
that distortion might occur in a more sharply 
divided population, such as might be found in 
industry. 

The nature, extent and influence of occu- 
pational stereotypes appear to offer many 
fruitful opportunities for research. The rele- 
vance of such stereotypes to industrial rela- 
tions and personnel management is obvious; 
they are undoubtedly an important factor in 
the way in which members of an occupational 
group are perceived and constitute part of
the role expectations for an occupation. It
seems likely that they play an important part 
in vocational choice. A developmental study 
of the stereotypes held by adolescents of dif- 
ferent ages would produce useful information 
on the process of vocational choice and on 
the formation of occupational stereotypes. 
Such research should not be limited to the 
method of Katz and Braly. 


Received July 1, 1957. 


References 


1. Asch, S. E. Forming impressions of personalities. 
J. abnorm. soc. Psychol., 1946, 41, 255-290. 

2. Gilbert, G. M. Stereotype persistence and change
among college students. J. abnorm. soc. Psy- 
chol., 1951, 46, 245-254. 

3. Haire, M. Role-perceptions in labor-management
relations: an experimental approach. Industr.
and Labor Relat. Rev., 1955, 8, 204-216.

4. Haire, M., & Grunes, W. Perceptual defense proc-
esses protecting and organizing perception of
another personality. Human Relat., 1950, 3,
403-412.

5. Haire, M., & Morrison, F. School children’s per-
ceptions of labor and management. Unpub-
lished manuscript, Univer. California, 1954.

6. Katz, D., & Braly, K. W. Racial stereotypes of
100 college students. J. abnorm. soc. Psy-
chol., 1933, 28, 175-193.

7. Roe, Anne. The psychology of occupations. New
York: Wiley, 1956.

8. Stagner, R. Stereotypes of workers and execu-
tives among college men. J. abnorm. soc.
Psychol., 1950, 45, 743-748.


Journal of Applied Psychology
Vol. 42, No. 2, 1958


Personality Needs of Under- and Overachieving Freshmen ¹


G. Gary Gebhart and Donald P. Hoyt 


Kansas State College 


Reviews of the literature on the differential 
personality characteristics of over- and un- 
derachievers have revealed inconsistent find- 
ings (3, 7). Studies in this area have been 
highly varied in approach, which probably 
accounts for much of this confusion. Experi- 
mental weaknesses may also contribute sig- 
nificantly. For example (a) various control 
factors have been neglected: sex differences 
have been confounded (e.g., 12); level of
academic progress has varied among Ss (e.g.,
1); educational-vocational orientation has not
been controlled (e.g., 1, 8, 10); differences in
over- and underachievement at various abil-
ity levels have not been considered (e.g., 11,
13, 15). (b) The definitions of over- and
underachievement have been vague (e.g., 2,
[illegible]). (c) Personality tests developed for
quite different purposes have been used (e.g.,
1, 5). (d) The Ss were representative of
limited or unusual populations (e.g., 4,
[illegible]).

The major purpose of this study was to in-
vestigate some personality correlates of over-
and underachievement, while avoiding the ex-
perimental pitfalls of the earlier studies. This
need to control variables previously neglect-
ed led to three subsidiary purposes of the
study. These were: to investigate whether
or not Ss at different ability levels had dif-
ferent personality needs; to determine whether
or not the personality correlates of over- and
underachievement were the same at various
ability levels; to investigate personality dif-
ferences between groups having differing vo-
cational orientations.


Procedure

Sample

The population investigated in this study consisted
of male freshmen who first enrolled in the Schools
of Engineering and Architecture or Arts and Sci-
ences at Kansas State College in the fall of 1956-57.
A total of 430 Engineering students and 310 Arts
and Sciences students were considered.

1 This study is taken from the M.S. thesis of the
first author (7), done under the direction of the
second author.


Each school group was subdivided into three abil-
ity levels. Any student with a predicted grade point
average² (GPA) of less than .70 (on the 3-point
system) was defined as a low ability student. Those 
with predicted GPA’s over 1.30 were defined as high 
ability students. All others were called average in 
ability. 

The Ss were further divided into under- and over- 
achievers. If the student’s obtained first semester 
grades were higher than those predicted, he was 
called an overachiever; if lower, he was called an 
underachiever. 

This division provided 12 groups (2 schools, 3 
ability levels, and 2 levels of achievement). The 12 
groups were then examined individually. In each 
group, the 20 Ss whose obtained GPA was most 
discrepant from that predicted were included in the 
present sample. The mean discrepancy between pre- 
dicted and obtained GPA’s for the 240 Ss thus se- 
lected was 1.01. Group means ranged from .72 to 


1.263 
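
The grouping rules in this section can be sketched as follows. The cut-offs (.70 and 1.30 on the 3-point system) and the choice of the 20 most discrepant Ss per group come from the text; the function names, the record layout, and the handling of a student whose obtained GPA exactly equals the prediction are assumptions made for illustration.

```python
from collections import defaultdict


def ability_level(predicted_gpa):
    """Low below .70, high above 1.30, otherwise average (3-point system)."""
    if predicted_gpa < 0.70:
        return "low"
    if predicted_gpa > 1.30:
        return "high"
    return "average"


def achievement_level(predicted_gpa, obtained_gpa):
    """Overachiever if obtained exceeds predicted, underachiever if below."""
    if obtained_gpa > predicted_gpa:
        return "over"
    if obtained_gpa < predicted_gpa:
        return "under"
    return None  # exact ties are not covered by the text; skipped here


def select_sample(students, per_group=20):
    """Keep the per_group most discrepant students in each design cell."""
    groups = defaultdict(list)
    for student in students:
        level = achievement_level(student["predicted"], student["obtained"])
        if level is None:
            continue
        key = (student["school"], ability_level(student["predicted"]), level)
        groups[key].append(student)
    return {
        key: sorted(
            members,
            key=lambda s: abs(s["obtained"] - s["predicted"]),
            reverse=True,
        )[:per_group]
        for key, members in groups.items()
    }


# Made-up records, for illustration only (not data from the study).
students = [
    {"school": "Engineering", "predicted": 1.5, "obtained": 2.4},
    {"school": "Engineering", "predicted": 1.4, "obtained": 1.6},
    {"school": "Arts and Sciences", "predicted": 0.5, "obtained": 0.2},
]
sample = select_sample(students, per_group=1)
print(sorted(sample))
```

With the full population of 740 students and per_group=20, a procedure of this kind would yield the 12 cells of 20 Ss described above.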


Design 

Scores on each of the 16 variables of the Edwards 
Personal Preference Schedule (EPPS), obtained from 
the freshman testing program, were collected for all 
240 Ss. The study employed a factorial design (2 
schools X 3 ability levels X 2 achievement levels), 
and the statistical tool was the analysis of variance. 
A total of 16 analyses was performed, one for each
of the 16 EPPS scales.


Results


Major Hypotheses

The major set of hypotheses of the study
concerned possible differences between over-
and underachievers. Put in the null form,
these read: on the variables in question, no
differences obtain between over- and under-
achievers. Table 1 summarizes the results of
these 16 analyses of variance.

2 Local studies had established the Pre-Engineering
Ability Test to be the best predictor in Engineering
(r = .60) and the A.C.E. to be the best predictor in
Arts and Sciences (r = .55). Therefore these were
the two ability tests employed in this study.

3 Because the experimental design required exactly
20 Ss in each of the 12 groups, it was necessary to
include a few students who were “marginal” over-
or underachievers. Thus, the GPA of seven Ss was
within 0.5 of the predicted GPA; for the remaining
233, this discrepancy was at least 0.5.


Table 1
EPPS Scores for Groups of Over- and Underachievers

             Underachievers       Overachievers
               (N = 120)            (N = 120)
Edwards
Need          Mean      SD        Mean        SD         F

Ach.          14.4      4.1       16.3        3.8      16.341***
Def.          12.0      3.6     [illegible]   3.5       2.714
Ord.          11.1      4.2       12.3        4.5       4.544*
Exh.          13.8      4.4       12.3        3.5       1.467
Aut.          13.6      3.6       13.4        4.5        --
Aff.          16.2      4.2       14.8        4.6       5.559*
Int.          13.2      4.9       14.5        4.8       3.975*
Suc.          11.6      4.3       10.6        4.5       3.231
Dom.          13.8      4.3       14.9        4.9       1.073
Aba.          15.3      4.5       16.1        4.8       1.981
Nur.          15.6      4.7       13.6        4.6      11.555***
Chg.          15.7      4.0       14.1        4.8       7.808**
End.          14.4      4.6       15.2        5.6       1.521
Het.          15.6      6.1       14.9        6.3        --
Agg.          12.3      4.0       12.3        4.9        --
Cs.           10.5      2.2       11.1        2.0       6.104*

  * P < .05
 ** P < .01
*** P < .001


Of the 16 null hypotheses, 7 were rejected
at the 5% level of confidence or beyond. The
results of these analyses may be summarized 
as follows: (a) Overachievers scored signifi- 
cantly higher on the following scales—Achieve- 
ment, Order, Intraception, and Consistency. 
The mean difference between the two groups 
on Achievement was especially significant. 
(b) Underachievers scored significantly higher 
on the following scales—Nurturance, Affilia- 
tion, and Change. The mean differences be-
tween the two groups on Nurturance and
Change were especially significant.


Subsidiary Hypotheses 


The first set of subsidiary hypotheses con- 
cerned the differences that might exist be- 
tween ability level groups. In the null form 
these hypotheses read: concerning the vari- 
able in question, no differences obtain among 
the High, Average, and Low ability groups.
The results of the 16 analyses of variance 
are contained in Table 2. 

The null hypotheses were rejected in 9 of 
the 16 analyses. These results may be sum-
marized as follows: (a) High ability groups
scored consistently and significantly higher
on the Achievement, Exhibition, Autonomy, 
Dominance, and Consistency scales. (b) Low 
ability groups scored consistently and signifi- 
cantly higher on the Deference, Order, Abase- 
ment, and Nurturance scales. 

The second set of subsidiary hypotheses 
was concerned with whether or not the per- 
sonality correlates of over- and underachieve- 
ment were the same at the three ability levels. 
This was tested in the analysis of variance by 
the interaction between ability and achieve- 
ment levels. 

Only 2 of the 16 analyses yielded an F 
large enough to warrant rejection of the null 
hypothesis, which in this case is stated as fol- 
lows: with regard to the variable in question, 
no differences attributable to the interaction 
of ability and achievement obtain among the 
groups. 

The null hypotheses that were rejected 
were found on the Heterosexuality and Con-
sistency scales. In both cases, .05 > P > .01.

The interaction found on Heterosexuality 
is illustrated in Table 3. High ability groups 
tended to score higher on this need than 
did low ability groups. Also, underachievers 
tended to score higher than overachievers.
But low ability overachievers scored higher
than low ability underachievers.

The interaction found on the Consistency 
scale is shown in Table 4. Overachievers 
scored significantly higher on this scale than 
did the underachievers, and high ability
groups scored significantly higher than did
low ability groups. Yet, high ability under-
achievers scored higher than high ability over-
achievers.
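
The interaction just described can be read directly off the Consistency cell means printed in Table 4. The sketch below only does arithmetic on those six means; it is not the analysis of variance itself. Unweighted averages are used because every cell has N = 40, and the variable names are hypothetical.

```python
# Consistency cell means from Table 4 (N = 40 in each group).
cell_means = {
    "under": {"high": 11.5, "average": 10.3, "low": 9.7},
    "over": {"high": 11.2, "average": 11.3, "low": 10.9},
}
levels = ["high", "average", "low"]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Main effects: average over the other factor.
achievement_means = {
    group: round(mean(cells.values()), 2) for group, cells in cell_means.items()
}
ability_means = {
    level: round(mean(cell_means[group][level] for group in cell_means), 2)
    for level in levels
}

# Simple effects: overachievers minus underachievers at each ability level.
over_minus_under = {
    level: round(cell_means["over"][level] - cell_means["under"][level], 1)
    for level in levels
}

print(achievement_means)  # {'under': 10.5, 'over': 11.13}
print(ability_means)      # {'high': 11.35, 'average': 10.8, 'low': 10.3}
print(over_minus_under)   # {'high': -0.3, 'average': 1.0, 'low': 1.2}
```

The sign of the difference reverses only at the high ability level, which is the pattern the interaction term picks up.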

The third set of subsidiary hypotheses con-
cerned the differences between groups with
different vocational orientations. In the null
form, the hypotheses may be stated as fol-
lows: concerning the variable in question, no
EPPS differences obtain between students in
the two schools (Engineering and Arts and
Sciences).

The null hypothesis was rejected twice.
Engineering groups scored significantly higher
on need Endurance (P < .001), while Arts
and Sciences students scored significantly
higher on Dominance (P < .05).


Table 2
EPPS Scores for Groups at Different Ability Levels

                 High               Average                Low
               (N = 80)             (N = 80)             (N = 80)
Edwards
Need        Mean      SD         Mean      SD         Mean      SD          F

Ach.        16.9      3.0        15.5      3.6        13.7      3.6      14.310***
Def.        11.2      3.2        12.9      2.5        13.1      3.2      [illegible]
Ord.        10.7      4.2        11.9      4.0        12.4      3.4       3.131*
Exh.        15.6      3.4        14.6      3.1        13.5      3.1       6.725**
Aut.        14.6      3.8        13.0      3.2        12.9      3.6       4.330*
Aff.        15.0  [illegible]    15.4      4.3        16.0      3.8        --
Int.        13.2      4.4        14.0      4.7        14.4  [illegible]   1.280
Suc.        11.3      4.3        11.3      4.1        10.6      3.2        --
Dom.        15.8      4.1        14.8      4.1        13.2      3.5      [illegible]
Aba.        14.5      4.2        16.0      4.2        16.6      3.8       4.702**
Nur.        13.7      5.3        14.6      4.1        15.5      4.3       3.129*
Chg.        15.0      4.3        14.2      4.3        15.5      3.3       1.689
End.        14.1      4.3        14.8      5.2        15.4      4.2       1.344
Het.        16.3      4.9        14.9      6.1        14.6      5.3       1.898
Agg.        12.2      4.1    [illegible] [illegible]  12.6      3.6        --
Cs.     [illegible] [illegible]  10.8      1.8        10.3      2.1       5.072**

  * P < .05
 ** P < .01
*** P < .001


Table 3
Heterosexuality Scores for Over- and Underachievers
at Three Ability Levels
(N = 40 in each group)

[Table body not recoverable from the source scan.]


Table 4
Consistency Scores for Over- and Underachievers
at Three Ability Levels
(N = 40 in each group)

                          Ability
Achievement       High     Average     Low

Under
  Mean            11.5      10.3        9.7
  SD               1.6       1.8        2.6
Over
  Mean            11.2      11.3       10.9
  SD               2.0       2.0        1.7


Discussion


In the interest of brevity, only the findings
relating to the major hypotheses will be dis-
cussed.

Gough (8) has recently published two
personality scales designed to predict aca-
demic achievement. These scales are called
Ac (achievement via conformance) and Ai
(achievement via independence), implying
that different personality characteristics may
be associated with the same behaviors (aca-
demic achievement) but for different reasons.
While little support for Gough’s suggested
dichotomy was found in this study (neither
Deference nor Autonomy was associated with
under- or overachievement), the suggestion
regarding a variety of personality patterns
among achievers and non-achievers does ap-
pear to have merit.

On the basis of the present study, three
different patterns of overachievement can be
hypothesized: (a) overachievement associated 
with a drive to complete (Achievement); (b) 
overachievement associated with a drive to 
organize or plan (Order); and (c) over- 
achievement associated with intellectual curi- 
osity (Intraception). Similarly, two patterns 
of underachievement may be hypothesized: 
(a) that associated with a need for variety 
(Change), wherein academic studies may ap- 
pear boring and routine; and (b) that asso-
ciated with social motives (Affiliation, Nur-
turance), wherein friendship may be placed
above scholarship. The fact that the scales 
involved do not intercorrelate significantly 
(6) supports the notion that several rela- 
tively distinct patterns, rather than a single 
pattern, are involved. 

Further studies to test these hypotheses 
would seem worth while. A pattern analysis 
of the EPPS or an intensive clinical study 
of under- and overachievers would probably 
throw additional light on this problem. 


Summary and Conclusions 


The major purpose of the study was to in- 
vestigate relationships of scores on the EPPS 
to under- and overachievement. Minor pur- 
poses were to study EPPS differences between 
schools and between ability levels, and to in- 
vestigate interactions among these three fac- 
tors. 

A sample of 240 freshman men at Kansas 
State was chosen such that 20 were included 
in each of 12 groups. The groups were classi- 
fied as to achievement (under or over), abil- 
ity (high, average or low), and school (En- 
gineering or Arts and Sciences). 

Within the limits of the sample employed, 
the following conclusions were reached: 

1. Overachievers scored significantly higher 
than underachievers on the Achievement, Or- 
der, Intraception and Consistency scales, and 
significantly lower on the Nurturance, Affilia- 
tion, and Change scales. 

2. High ability Ss scored significantly higher
than those of low ability on the Achievement, 
Exhibition, Autonomy, Dominance, and Con- 
sistency scales, and significantly lower on the 
Deference, Order, Abasement, and Nurtur- 
ance scales. 

3. Engineering students scored significantly
higher than Arts and Sciences students on the
Endurance scale, and significantly lower on
the Dominance scale.


G. Gary Gebhart and Donald P. Hoyt


4. Two interactions between ability and 
achievement levels were found, one on the 
Heterosexuality scale and the other on the 
Consistency scale. 


5. Hypotheses regarding need patterns of 
under- and overachievers were developed. 
Received July 2, 1957. 


References 


1. Altus, W. D. A college achiever and non-achiever
scale for the Minnesota Multiphasic Personality
Inventory. J. appl. Psychol., 1948, 32, 385-397.

2. Bennett, G. K., & Gordon, H. P. Personality
test scores and success in nursing. J. appl.
Psychol., 1944, 28, 267-278.

3. Burgess, Elva. Personality factors of over- and
under-achievers in engineering. Unpublished
doctoral dissertation, Pennsylvania State Coll.,
1953.

4. Cash, W. L., Jr. The relation of personality
traits to scholastic aptitude and academic
achievement of students in a liberal Protestant
seminary. Disser. Abstr., 1954, 14, 630-631.

5. Cooper, J. G. The inspection Rorschach in the
prediction of college success. J. educ. Res.,
1955, 49, 275-283.

6. Edwards, A. L. Manual for the Edwards Personal
Preference Schedule. New York: Psychological
Corp., 1954.

7. Gebhart, G. G. Personality factors in academic
achievement. Unpublished master’s thesis,
Kansas State Coll., 1957.

8. Gough, H. G. Factors related to differential
achievement among gifted persons. Berkeley:
Inst. of Personal. Assess. and Res., Univer.
California, 1955. Pp. 1-22 (Mimeo.).

9. Gough, H. G. Manual for the California Psychological
Inventory. Palo Alto: Consulting
Psychologists Press, 1957.

10. Gustad, J. W. Academic achievement and Strong
occupational level scores. J. appl. Psychol.,
1952, 36, 75-78.

11. McArthur, C. C., & King, S. Rorschach configurations
associated with college achievement.
J. educ. Psychol., 1954, 45, 492-498.

12. Osborne, R. T., Sanders, Wilma B., & [illegible],
J. E. The prediction of academic success by
means of “weighted” Harrower-Rorschach responses.
J. clin. Psychol., 1950, 3, 253-258.

13. Owens, W. A., & Johnson, Wilma C. Some
measured personality traits of collegiate underachievers.
J. educ. Psychol., 1949, 40, [pages illegible].

14. Waggoner, R. W., & Zeigler, T. W. Psychiatric
factors in medical students who fail. Amer.
J. Psychiat., 1946, 103, 369-376.

15. Zelman, W. R. The relationship of the traits
of the Bernreuter Personality Inventory to
academic success. J. Amer. Assn. Coll. Registrars,
1945, 21, 81-84.


Journal of Applied Psychology
Vol. 42, No. 2, 1958


A Factor Analysis of Aptitude and Proficiency Measures in
Radiotelegraphy 1


Edwin A. Fleishman

Yale University


Millard M. Roberts

University of Florida


and Morton P. Friedman

Ohio State University


The studies of radiotelegraphy by Bryan
and Harter (1, 2) probably represent the first
of all attempts to apply quantitative, scientific
procedures to the systematic study of
skill performance. Since this pioneer study,
skill in radiotelegraphy has been examined
from a variety of viewpoints. West (13), as
well as Windle, Sidman, and Keller (15),
have provided recent comprehensive reviews
of the extensive variety of studies on learning
and training; Taylor (10), and more
recently Creager (3), have reviewed studies
on selection procedures for radiotelegraphers.

The primary task investigated in most of
these previous studies was skill in receiving
International Morse Code.

Despite the variety of studies on Morse
Code reception, there is actually very little
known about the nature of the fundamental
abilities underlying proficiency in this skill.
In general, evidence in the area of selection
indicates that printed, academic type tests
achieve only low predictions of code proficiency,
and certain auditory tests yield
higher predictions. Beyond this we know
little about the fundamental aptitudes involved.
A recent study by Fleishman (4),
through the inclusion of a large variety of
auditory tests, sought to specify more precisely
the kinds of auditory measures providing
the best predictions of subsequent code
proficiency. Although 10 different auditory
tests were evaluated, it was found that a combination
of two tests (a measure of initial
code learning and a measure of rhythm discrimination)
yielded maximum prediction and,
beyond this, addition of other measures produced
no increase in prediction. This was
true even though a variety of the remaining
auditory tests possessed significant individual
validities. It is obvious that in order to effect
improvements in current selection procedures,
we need to know more about the
fundamental abilities involved in such tests
as well as in our criteria of code-receiving
proficiency. Such knowledge would have implications
not only for problems of code proficiency,
but also for the broader problems
concerning the nature of auditory-perceptual
processes.

1 This research was performed while the writers
were with the Air Force Personnel and Training
Research Center, Lackland Air Force Base, Texas, in
support of Project No. 7706, Task No. 27002. Permission
is granted for reproduction, translation, publication,
use, and disposal in whole or in part by or
for the U. S. Government.

The present study is an attempt to obtain 
this kind of information through the applica- 
tion of factor analysis techniques to both 
aptitude measures and proficiency criteria. It 
represents a follow-up to our earlier study 
(4, 6). The present study utilizes a different 
sample and improved measures of certain of 
the auditory tests previously found valid by 
Fleishman (4, 6). In addition, we included 
certain printed tests in aptitude areas not 
previously evaluated against code proficiency. 


Method 


A battery of 14 specifically selected or designed 
tests was administered to 310 airmen prior to their 
entrance into radio operator training. Five of these 
tests were auditory tests (recorded on high fidelity 
tape); the remaining nine were printed tests. All of 
these were hypothesized to measure abilities relevant 
to success in the learning of Morse Code. A brief 
description of each test variable follows: 




Aural Tests 


1. Rhythm Discrimination (Form D).2 This is 
our fourth revision of an adaptation of the rhythm 
subtest of the Seashore Measures of Musical Talent 
(4, 6). The examinee hears a series of 70 pairs of 
rhythm patterns (beats within each pair presented 
in rapid succession). After each pair, he must mark, 
on an IBM answer sheet, under the S if the pat- 
terns in each pair are the same, under the D if they 
are different, and under the (?) if he cannot decide. 

2. Dot Perception Test (Form C). This is our 
third revision of a test described previously (4, 6). 
The examinee hears a series of code signal groups, 
consisting of rapid patterns of “dots” and “dashes.” 
Each signal group is about the same over-all length 
in terms of time, but the internal arrangement of 
the “dots” and “dashes” varies from item to item. 
The “dots,” however, always come in a single series, 
at the end or beginning of the signal group. For 
each group (items) the examinee simply marks (on 
an IBM answer sheet) the number of “dots” pre- 
sented (1, 2, 3, 4, or 5) within each group. The 
speed of transmission increases at intervals through 
the test. 160 items. 

3. Copying Behind. The examinee hears groups 
of numbers called out in rapid succession (e.g., 4-2- 
5-1-3). His task is to mark under the proper num- 
ber, in turn, on an IBM answer sheet, attempting to 
keep up with the pace set by the narrator. The pace 
increases at intervals through the test. 240 items. 

4. Hidden Tunes. This test has been described 
earlier by White (14). The examinee hears a series 
of short tunes presented in pairs. The second tune 
is always longer than the first in each pair, and the 
examinee must determine if the second tune in- 
cluded the first. For each pair he marks under yes
or under no on his answer sheet. 50 items.

5. Army Radio Code Test (ARC). This test (3, 
4, 6) is designed to measure the speed with which 
the examinee can learn certain actual Morse Code 
signals (for I, N, and T). About 25 min. of the
test involves the practice, with knowledge of results, 
of these signals under increasingly higher rates of 
speed. For the test period the examinee marks un- 
der the I, N, or T on the special IBM answer sheet 
as each signal (dot-dash, dash-dot, dash) is pre- 
sented in rapid succession. 150 test items. 


Printed Tests 


Variables 6 through 10 are tests developed by
Thurstone to measure “closure” factors.4 Variables
6 through 8 are reported to be reference tests of a
“Speed of Closure” factor, defined as “the ability to
unify an apparently disparate perceptual field into
a single percept” (7). It seemed a reasonable hypothesis
that receiving Morse Code may involve such
an ability. At least it appears that a stream of
stimulus signals must somehow be unified, given
structure, and broken into the appropriate units.
The tests representative of this ability were:

6. Four-letter Words. Twenty-two 46-letter lines
of capital letters are presented on a printed page.
The task is to circle all the four-letter words that
can be found spelled out in this array. Thus, the
examinee scans along these lines and every time he
finds four letters in sequence which spell a word, he
circles them.

7. Mutilated Words. Each item presents a word
with parts of each letter missing. The examinee
writes out the full word in an adjacent space.

8. Gestalt Completion. This is Thurstone’s adaptation
of the Street Gestalt Completion Test. Drawings
are presented which are composed of
blotches representing only suggestive parts of the
objects portrayed. The examinee attempts to “make
sense” out of these and for each drawing writes
down the name of the object.

Variables 9 and 10 are reported as reference tests
of a “Flexibility of Closure” factor, which is defined
as “the ability to keep in mind a definite configuration
so as to identify it in spite of perceptual distractions”
(7). This factor differs from the “Speed
of Closure” factor in that the examinee knows the
particular configuration he is looking for.

9. Designs. Three hundred geometrical designs are
presented, in 40 of which a sigma (Σ) is embedded.
The task is to mark the designs in which the sigma
occurs.

10. Concealed Figures. This is Thurstone’s adaptation
of the Gottschaldt Figures Test. The task is
to select the one of five given geometrical figures
that is contained in a more complex geometrical
figure.

11. Marking Accuracy. The task is simply to
mark a standard IBM answer sheet in which one of
the five alternatives to each item has been
printed with a small circle. The examinee’s task is
merely to mark the answer sheet as rapidly as possible
under the indicated circles, going from one
item to the next. In a sense, this is the visual counterpart
of the aural Copying Behind Test described
above.

12. Word Knowledge. This is a multiple-choice
vocabulary test.

13. Background for Current Affairs. This is an
informational test covering current, recent, and historical
events. This test, together with the Word
Knowledge Test, has consistently defined a Verbal
factor on previous Air Force studies.

14. Pattern Comprehension (9). A series of drawings
require the examinee to visualize the relationships
between components of solids and their unfolded
flat projections.


2 The writers are indebted to Harold Seashore and
The Psychological Corporation for permission to
modify the Seashore test for our purposes.

3 The writers are indebted to Edward L. Walker
and Benjamin White for providing a copy of this
test.

4 Permission to reproduce these tests for our purposes
was granted by L. L. Thurstone.



Table 1

Intercorrelations
(N = 310)

Be eh ee ee ee 
1 2 3 4 5 6 7 8 9 t0 11 2 aS a Sa 

L Khyihm Dinimi — 32 2S s N © E u or 0s -or-o e 20 
2. Dot Perception eS = auauai pua DD DW ww ana 
3. Copying Behind 5s ay = 3O $4. 18° 28 11 a 32. Bh) 26,5 Zia 
4. Hidden Tunes 38 46 32 — 15 11 18 10 19 23 Ol 16: :O7 ~ 250-19 
5. Army Radio Code 5 u a i- «OF 42 08 19 2h 18 aa 
6. Four-letter Words 0 17 #21 #12 02 — 4 26 30 20 15 29:22 1S1 
7. Mutilated Words 00 23 32 20 05 45 — 36 25 2 19 42: 35 19 27 
8. Gestalt Completion p 04 14 12 B 2 38 — 27 36 213i) rezi 26-0 
9. Designs 08 22 25 20 13 31 27 25 — 45 21 21 18 43 06 
10. Concealed Figures 09 24 36 25 19 ‘22 27 37 47 — 29 25 31 53 02 
11. Marking Accuracy op 03. 23, 02 10 16 21 22,2228 — 09° 2) 2350Gs 
12, Word Knowledge a 96 32 19 10 Si) 45 30° 25°20 © 12) N 
U Backi. Cun Adae =O 18 22 Ids a ee eS ee 31 07 
14. Pattern Comprehension 15 19 7 27 2 -2 36. ASS 26 SiS 00 
[ii WokceneyCrieios A WS Zl 2 Te Gian ee 7. 12>, OL 


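Note.—Values above the diagonal are the obtained (restricted)
correlations; values below the diagonal are corrected for
restriction of range. Decimals omitted.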


The Criterion of Code Proficiency 


15. Number of Classroom Days to Attain a Code
Receiving Speed of 14 Groups Per Minute. All students
in the course receive frequent code checks under
comparable conditions. Students are held back
until the checks at each code speed are passed satisfactorily.
Time to reach given code speeds represents
a uniquely unambiguous criterion of proficiency.
(The advantages of a “time to learn” criterion in
this context have been described recently by Gordon
(8).) The criterion chosen was that of 14 groups
per minute, which is the stage at which the student
must qualify to be admitted to further phases of
radio operator training. There are wide individual
differences, even after selection, in the amount of
time needed to reach this criterion. Practically all
failures in the course are attributed to difficulties in
receiving Morse Code, rather than to difficulties in
sending code or in other academic subjects.


Data Analysis Procedures

The intercorrelations among these 15 variables
were obtained. These are presented in the upper
half of Table 1. Since the criterion variable is in
terms of time to attain a given proficiency level, the
validities obtained were uniformly negative. The
signs were reflected for our purposes. Thus, the
positive validities in Table 1 indicate that a particular
test is positively related to “good” performance
(low amount of days).

Since all of the Ss in our sample entered
training on the basis of their scores on the
Airman Classification Battery, it was necessary to correct
these correlations for restriction of range. The
basis for selection was the Radio Operator Aptitude
Index (ROAI), a composite score derived from five
classification tests. Corrections of all the obtained
correlations were made in accordance with the procedures
outlined by Thorndike (11). The bottom
half of Table 1 presents the corrected coefficients.
These are the coefficients utilized in the factor
analysis.
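
The text does not say which of Thorndike's correction formulas was applied. Because selection operated on a third variable (the ROAI) rather than on the tests themselves, the standard three-variable formula for indirect selection is one plausible reading. The Python sketch below illustrates that formula only, under that assumption; the function name, argument names, and input values are hypothetical and are not taken from the study.

```python
import math


def correct_for_indirect_selection(r_xy, r_xz, r_yz, sd_ratio):
    """Estimate the unrestricted correlation between x and y.

    Assumes explicit selection on a third variable z (here, something
    like the ROAI), with all three correlations observed in the
    restricted (selected) group.

    sd_ratio is the unrestricted SD of z divided by its restricted SD.
    """
    k = sd_ratio ** 2 - 1.0  # relative gain in variance of z
    numerator = r_xy + r_xz * r_yz * k
    denominator = math.sqrt((1.0 + r_xz ** 2 * k) * (1.0 + r_yz ** 2 * k))
    return numerator / denominator


# Hypothetical inputs: a test-criterion correlation of .20 in the selected
# group, with the selection variable correlating .30 and .40 with each.
corrected = correct_for_indirect_selection(0.20, 0.30, 0.40, sd_ratio=1.5)
print(f"corrected r = {corrected:.3f}")
```

When `sd_ratio` equals 1 (no restriction), the function returns the observed correlation unchanged, which is a quick sanity check on any implementation of this correction.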

Six factors were extracted from the correlation 
matrix using Thurstone’s Centroid Method (12). 
Orthogonal rotations were accomplished, using Zim- 
merman’s graphical procedure (16), until simple 
structure appeared to be closely approximated. The 
sixth centroid factor extracted was considered to 
contain only residual variance and was not rotated. 
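
As an illustration of the first step of a centroid extraction, the sketch below computes first-centroid loadings as the column sums of the correlation matrix divided by the square root of its grand total, and then forms the residual matrix. It is a simplified sketch, not the authors' computation: the 3 x 3 matrix, with assumed communality estimates on the diagonal, is invented for the example, and the sign-reflection step needed before extracting later factors is omitted.

```python
import math


def first_centroid_loadings(r):
    """Return first-centroid loadings for a square correlation matrix r.

    r is a list of lists with communality estimates on the diagonal.
    Each loading is a column sum divided by sqrt(sum of all entries).
    """
    n = len(r)
    grand_total = sum(sum(row) for row in r)
    column_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    return [s / math.sqrt(grand_total) for s in column_sums]


def residual_matrix(r, loadings):
    """Remove the variance reproduced by one factor from r."""
    n = len(r)
    return [[r[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


# Hypothetical matrix; diagonal entries are assumed communality estimates.
example = [
    [0.60, 0.50, 0.40],
    [0.50, 0.50, 0.30],
    [0.40, 0.30, 0.40],
]
loadings = first_centroid_loadings(example)
residuals = residual_matrix(example, loadings)
print([round(a, 2) for a in loadings])
```

By construction the entries of the first residual matrix sum to zero, which is why variables must be reflected in sign before a second centroid can be taken.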


Results 


Table 2 presents the centroid and rotated 
factor matrices. The rotated factors were 
interpreted for psychological meaningfulness. 
Loadings above .30 are listed in turn for each 


factor. 


Factor I is interpreted as Visualization. 


No.  Variable                 Loading
14   Pattern Comprehension      .69
10   Concealed Figures          .66
 9   Designs                    .54
 8   Gestalt Completion         .50
11   Marking Accuracy           .32




Table 2

Centroid and Rotated Factor Matrices

                                Centroid Loadings a               Rotated Loadings a
                                     Factors                          Factors c
Variable                       I   II  III   IV    V   VI      I   II  III   IV    V VI b   h²
 1. Rhythm Discrimination     29   41  -22  -14  -19   12     14  -11   56   13   00   12   ..
 2. Dot Perception            54   48   05  -04  -04  -19     07   10   58   28   32  -19   ..
 3. Copying Behind            62   29   06   19   16   07     19   11   32   31   54   07   ..
 4. Hidden Tunes              46   34  -24  -24  -21  -04     28   04   62   15  -01  -04   ..
 5. Army Radio Code           46   28   10   05   28  -06     08   15   30   09   51  -06   ..
 6. Four-letter Words         46  -19   14   17  -34   05     25   23   02   54  -02   05   ..
 7. Mutilated Words           58  -17   23   08  -31   16     25   38   07   55   03   16   ..
 8. Gestalt Completion        46  -43  -06  -15  -06   16     50   40  -06   12  -04   16   45
 9. Designs                   53  -19  -23   18  -04  -23     54   05   02   27   16  -23   ..
10. Concealed Figures         61  -21  -31   09   18  -11     66   12   04   10   31  -11   ..
11. Marking Accuracy          32  -20  -07   25   15   17     32   05  -17   15   28   17   26
12. Word Knowledge            63  -22   39  -30   04  -06     20   76   14   19   16  -06   70
13. Backgr. Curr. Affairs     59  -33   30  -28   27   08     28   74   00   01   28   08   ..
14. Pattern Comprehension     58  -21  -38  -07   15  -13     69   15   12  -01   20  -13   57
15. Proficiency Criterion     39   37   32   16  -09  -05    -16   12   31   44   31  -05   43
Σa²/k                         26   09   06   04   05   02     13   11   09   08   08   02

a Decimals omitted.
b Factor VI not rotated.
c Factors are identified as follows: I, Visualization; II, Verbal Ability; III, Auditory Rhythm Perception; IV, Speed of
Closure; V, Auditory Perceptual Speed; VI, Residual.
[h² entries shown as .. are not legible in this copy.]


Although three of these variables have been 
listed (7) as definers of “closure” factors, in- 
terpretation as the better established visuali- 
zation factor appears much less strained. As 
we shall see, a separate closure factor was 
identified in our analysis, and the three 
“closure” type tests on the present factor do 
not represent the same closure factor accord- 
ing to Thurstone’s definitions (7). Thus, 
Designs and Concealed Figures represented 
his “Flexibility of Closure” factor, and Ge- 
stalt completion is purported to be a meas- 
ure of “Speed of Closure.” Furthermore, the 
main definer of Factor I, “Pattern Compre- 
hension,” has been found repeatedly (5, 9) 
to define Visualization. Tests of this factor 
seem to require mental manipulation of visual 
objects, in which it is necessary to move, 
twist, turn, or invert one or more parts of a 
configuration (in imagination) and to recog- 
nize the new position, location, or changed ap- 
pearance after the modification. This seems 
to fit Pattern Comprehension as well as the 
Thurstone closure tests, all of which involve 


shapes. The loading of Marking Accuracy
on this factor is not explainable in terms of
this definition, but it is to be noted that there
is quite a gap between the main definers of
this factor (loadings .50-.69) and this latter
test (loading .32). Whatever the nature of
this factor, we find that the criterion of code
proficiency is not loaded on it.


Factor II is identified as the Verbal Ability factor.


No.  Variable                          Loading
12   Word Knowledge                      .76
13   Background for Current Affairs      .74
 8   Gestalt Completion                  .40
 7   Mutilated Words                     .38


The main definers are those which have
defined the Verbal Ability factor in many
previous studies (e.g., 5, 9). Loadings of
Gestalt Completion and Mutilated Words are
consistent with this interpretation. The former
requires the spelling out of words and
the latter, the recognition of words. The
test Four-letter Words had the next highest
loading (.23) on this factor. Again, the criterion
of code proficiency does not load on
this factor.


Factor III is interpreted as Auditory Rhythm Perception.


No.  Variable                           Loading
 4   Hidden Tunes                         .62
 2   Dot Perception (Form C)              .58
 1   Rhythm Discrimination (Form D)       .56
 3   Copying Behind                       .32
 5   ARC                                  .30
15   Criterion                            .31


The factor is common to all of the aural 
tests included and to none of the printed 
tests. Perception of rhythm appears to be 
the most critical feature of the three tests 
with loadings above .50, in all of which the S 
receives his stimulus signals in groups. This 
factor appears general to auditory tasks re-
gardless of the kind of auditory signal in-
volved (e.g., code signals, tunes, beats, call- 
ing of numbers). Of special interest is the 
loading of the Morse Code proficiency cri- 
terion on this factor. 


Factor IV is identified as Speed of Closure.


No.  Variable             Loading
 7   Mutilated Words        .55
 6   Four-letter Words      .54
 3   Copying Behind         .31
15   Criterion              .44


The main definers of this factor are the
tests designed by Thurstone as measures of
his Speed of Closure factor. In Mutilated
Words and Four-letter Words, the S does not
know the stimulus unit that he is looking for,
but must “organize” the stimulus material
into meaningful units. It is to be noted that
the “closure” tests not imposing this requirement
do not appear on this factor. As a
methodological point, it should be noted that
[...] verbal tests in the [...] partial out the
ver[...] in the cri[...] on this factor. [...]
original hypothesis that [...] Morse Code, a
major task is to group an apparently disparate
auditory perceptual field
into meaningful units. The finding that this
kind of closure factor is common to auditory,
as well as to visual, perceptual tasks would
appear to have important general significance.


Factor V is identified as an Auditory Perceptual Speed factor.


No.  Variable             Loading
 3   Copying Behind         .54
 5   Army Radio Code        .51
 2   Dot Perception         .32
10   Concealed Figures      .31
15   Criterion              .31


The three auditory tests loaded on this fac- 
tor are the ones which emphasize speed the 
most. Copying Behind was especially de- 
signed as a speed measure, and both the Army 
Radio Code Test and the Dot Perception Test 
carry the subject to increasingly high speed 
levels. In each test, it is necessary for the S 
to respond almost immediately or he will miss 
the next stimulus presented. In most cases 
he is responding to an earlier stimulus at the 
same time that new stimuli are being pre- 
sented. The two auditory tests, Rhythm 
Discrimination and Hidden Tunes, which do 
not appear on this factor, allow the S more 
time to make a response after the presenta- 
tion of each stimulus pair. 

The low, but significant, loading of the 
Concealed Figures Test is, of course, not en- 
tirely consistent with our interpretation of 
this factor as “Auditory Perceptual Speed.” 
It is possible that the presence of Concealed 
Figures suggests that this factor is the same 
as the Perceptual Speed factor found among 
certain kinds of printed tests involving rapid 
discrimination of visual detail (7, 9). How- 
ever, there is no way to assess this in the 
present study as no reference tests of Per- 
ceptual Speed have been included. The pos- 
sibility that Perceptual Speed may extend to 
auditory and visual perception is worthy of 
future study. 

The Marking Accuracy Test, which sam- 
ples sheer speed of marking in the slots of an 
answer sheet, has a loading of only .28 on 
this factor. Since this test requires responses 
identical to the Copying Behind Test, but in-
volves no appreciable stimulus discrimination,
it would appear that emphasis on this factor
is on speed of stimulus discrimination and not 
on speed of response. 

As can be seen, the criterion of Morse Code 
reception is loaded on this factor. This seems 
completely consistent with our factor descrip- 
tion inferred from the loadings of the test 
variables. 


Discussion 


These results suggest that the following 
three ability factors contribute to individual 
differences in proficiency in Morse Code re- 
ception. 

1. Speed of Closure. The ability to unify 
an apparently disparate perceptual field into 
meaningful component units. 

2. Auditory Rhythm Perception. The abil- 
ity to discriminate rhythmic patterns inher- 
ent in particular auditory stimulus groups. 

3. Auditory Perceptual Speed. The speed 
with which an individual can discriminate in- 
dividual auditory stimulus signals presented 
in rapid succession. 

From the communality estimate (Table 2) 
it was also shown that the factors identified 
in the present study accounted for 43% of 
the variance in the code proficiency criterion. 
This suggests additional factors may yet be 
found through the inclusion of additional 
kinds of ability variables. However, a com-
munality of this magnitude implies that a
multiple R of .66 may be achievable (√R²)
using the kinds of predictor variables investi-
gated here. This is higher than has been
achieved in previous studies of radioteleg- 
raphers in this operational training setting 
(3, 4, 6). Multiple correlational studies are 
now underway, using a wider variety of pre- 
dictor and criterion variables. 
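
The arithmetic behind the 43% and .66 figures can be checked directly from the criterion's rotated loadings in Table 2 (-.16, .12, .31, .44, .31, and -.05 on the unrotated sixth factor). The short Python sketch below is only a check of that arithmetic; the variable names are illustrative. It assumes, as the text implies, that the communality is the sum of squared loadings and that its square root is the largest multiple R attainable from these common factors.

```python
import math

# Rotated loadings of the proficiency criterion (variable 15) in Table 2.
criterion_loadings = {
    "I": -0.16,
    "II": 0.12,
    "III": 0.31,
    "IV": 0.44,
    "V": 0.31,
    "VI": -0.05,  # residual factor, not rotated
}

# Communality: sum of squared loadings across the common factors.
communality = sum(a * a for a in criterion_loadings.values())

# Square root of the communality reported to two decimals.
upper_bound_r = math.sqrt(round(communality, 2))

print(f"communality = {communality:.2f}")
print(f"upper bound on multiple R = {upper_bound_r:.2f}")
```

Taking the square root of the rounded communality (.43) reproduces the reported .66; the unrounded sum of squares gives a value closer to .65, so the difference is a matter of rounding.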

The present results confirm earlier indica- 
tions that aural tests are likely to give better 
predictions of code proficiency than most 
printed test variables. These results help 
rationalize this empirical finding in terms of 
underlying abilities. A major contribution 
of the present study is the finding of a 
“closure” factor in code proficiency. Meas- 
ures of this domain have not been included 
in previous studies on the selection of radio- 
telegraphers. 




It should be stressed that the relative importance
of the factors identified may not
hold for very early or very advanced levels
of proficiency. Studies are in progress on
possible changes in such patterns as a function
of practice. It is possible that studies
of this type, done at successive levels of proficiency,
may throw some light on the processes
involved in the learning of Morse Code.

Methodologically, these results indicate that
properly designed factor analyses, including
both predictor and criterion variables, may
provide important leads regarding the fundamental
abilities underlying proficiency in
complex jobs.


Summary 


Fourteen auditory and printed aptitude
measures were administered to students prior
to entry into training for radiotelegraphy.
These 14 measures, together with a criterion
of proficiency in learning to receive Morse
Code, were subjected to factor analysis study.
Five factors were identified as Visualization,
Verbal Knowledge, Speed of Closure, Auditory
Rhythm Perception, and Auditory Perceptual
Speed. Three of these, Auditory
Rhythm Perception, Auditory Perceptual
Speed, and Speed of Closure, were found to
contribute to the criterion of subsequent code
proficiency.


Received July 12, 1957. 


References 


1. Bryan, W. L., & Harter, N. Studies in the
physiology and psychology of the telegraphic
language. Psychol. Rev., 1897, 4, 27-53.

2. Bryan, W. L., & Harter, N. Studies on the telegraphic
language; the acquisition of a hierarchy
of habits. Psychol. Rev., 1899, 6, 345-375.

3. Creager, J. A. Comparative validation of two
radio code tests for use in conjunction with
the Airman Classification Battery in the selection
of radio operator trainees.
AFPTRC-TR-54-65. Lackland AFB, Texas:
Air Force Personnel and Training Research
Center, 1954.

4. Fleishman, E. A. Predicting code proficiency of
radiotelegraphers by means of aural tests. J.
appl. Psychol., 1955, 39, 150-155.

5. Fleishman, E. A., & Hempel, W. E. The relation
between abilities and improvement with
practice in a visual discrimination reaction
task. J. exp. Psychol., 1955, 49, 301-312.

6. Fleishman, E. A., & Spratte, J. G. The prediction
of radio operator success by means of
aural tests. Tech. Rep. AFPTRC-TR-54-66.
Lackland AFB, Texas: Air Force Personnel
and Training Research Center, 1954.

7. French, J. W. (Ed.). Manual for kit of selected
tests for reference aptitude and achievement
factors. Princeton, New Jersey: Educational
Testing Service, 1954.

8. Gordon, L. V. Time in training as a criterion
of success in radio code. J. appl. Psychol.,
1955, 39, 311-313.

9. Guilford, J. P. (Ed.). Printed classification tests.
AAF Psychol. Program Res. Rep. No. 5.
Washington: U. S. Government Printing Office,
1947.

10. Taylor, D. W. Learning telegraphic code. Psychol.
Bull., 1943, 40, 461-487.

11. Thorndike, R. L. Personnel selection. New
York: Wiley, 1949.

12. Thurstone, L. L. Multiple factor analysis. Chicago:
Univer. of Chicago Press, 1947.

13. West, L. J. Review of research on Morse Code
learning. Res. Rep. AFPTRC-TN-55-52.
Lackland Air Force Base, Tex.: Air Force
Personnel and Training Research Center, December
1955.

14. White, B. W. Visual and auditory closure. J.
exp. Psychol., 1954, 48, 234-240.

15. Windle, C., Sidman, M., & Keller, F. S. Studies
in radiotelegraphy. New York: Columbia
Univer., 1953.

16. Zimmerman, W. G. A simple graphical method
for orthogonal rotation of axes. Psychometrika,
1946, 11, 51-55.


Journal of Applied Psychology 
Vol. 42, No. 2, 1958 


Recovery From Unusual Aircraft Attitudes Under the
Influence of Vertigo 1


J. E. Conklin and O. H. Lindquist 


Minneapolis-Honeywell Aeronautical Division 


The conventional aircraft attitude indicator 
is termed an inside-out display since it repre- 
sents the artificial horizon as the moving ele- 
ment, similar to what the pilot would per- 
ceive in contact flying. An outside-in dis- 
play represents the same information by 
presenting the artificial horizon as the sta- 
tionary element and depicting roll and pitch 
deviations by a moving drone. Research has 
demonstrated that the latter presentation of 
aircraft attitude has certain advantages over 
the more “realistic” indicator (1, 2, 3). That
the moving drone concept, however, is not 
currently used is due to a concern over the 
transfer effects between instruments. Recent 
flight tests have shown that transfer from a 
moving horizon to a moving drone display is 
quickly achieved. It is believed, however, 
that unless all aircraft are equipped with a 
moving drone display the transfer effect from 
this display to the conventional one may be 
negative, and a possible cause of accident,
particularly when the pilot is faced with a
critical situation such as disorientation due to
vertigo.


Method 


The study was designed to measure recovery per- 
formance with two contrasting concepts of attitude 
indication under the influence of vertigo. It was 
also designed to test whether extensive training with 
the moving drone display interfered with recovery
performance with the moving horizon or conven- 
tional indicator. 

Apparatus. The apparatus consisted of (a) the 
moving horizon and moving drone displays with as- 
sociated circuits, (b) analog computers to simulate 
aircraft dynamics, (c) problem generator for train- 
ing purposes, (d) joy-stick control mounted on a 
chair modified for rotation and equipped with a 
safety belt and foot rest, and (e) a two-channel
Brush recorder. 


¹ This study was conducted by the research department
of the Minneapolis-Honeywell Aeronautical Division.
The authors are indebted to Alex Weisz for
stimulating this research. The pilots, Jim Bradford
and Carl Gruber, are also thanked for their
participation in this study.


Subjects. Two experienced pilots served as subjects
for this study.

Procedure. The experiment comprised six experi- 
mental sessions for each pilot. The first and last 
hour of each session was devoted to training with 
the moving drone display with one exception, i.e.,
on the first day, only the moving horizon indicator
was practised. Between the two training periods,
the pilot’s ability to recover from unusual attitudes, 
subsequent to semicircular canal stimulation, was
tested with both indicators.

Each training period consisted of a continuous 
tracking task for 20 two-min. trials with a rest
between trials. The recovery tests consisted of
16 items per indicator, comprised of a displacement
in pitch and roll, chosen randomly. The pilot was
instructed to return the display elements to straight
and level flight as rapidly as possible following the 
rotation. 

Preceding each recovery trial, the pilot was rotated
to induce vertigo. Eight combinations of head
positions and rotation directions were used. Rotations
to the right and left were given on alternate
2-trial intervals for each head position. The pilot
was rotated for five turns with the head in the upright
position, four turns with the head on the right
or left shoulder, and three turns with the head bent
forward. Rotation speed was approximately 180°
per second.

The order of display presentation during the test
period was counterbalanced in order to eliminate the
bias of sequential effects of learning and semicircular
canal stimulation.

Recovery performance in pitch and roll was recorded
on a two-channel Brush recorder. From these records,
total time to recover and reversal errors in pitch and
roll were obtained. Time scores were subjected to an
analysis of variance. Reversal errors were analyzed for
pitch and roll separately, using the binomial expansion
to determine the probability that the observed
differences between the two indicators occurred by chance.


[Figure 1: per cent reversal errors (vertical axis) plotted
against attitude recovery tests 1 to 6 (horizontal axis),
with separate curves for roll and pitch. The only legible
curve label is "Moving Horizon."]

Fig. 1. Per cent reversal errors.


Table 1

Comparison of Reversal Errors During Recovery
with the Moving Drone and Moving
Horizon Attitude Indicators

Sessions   Dimension   Moving Horizon   Moving Drone    p
1-2        Roll              14               6        .058
           Pitch             21               2        .001
3-4        Roll               6               0        .016
           Pitch             12               0        .001
5-6        Roll         [illegible]
           Pitch             13               0        .001


[Figure 2: recovery time in seconds (vertical axis) plotted
against attitude recovery tests 1 to 6 (horizontal axis).
The only legible curve label is "Moving Horizon."]

Fig. 2. Average time to recover from unusual attitudes.


Results

Figure 1 shows the percentage of reversal errors for
both indicators in pitch and roll. Reversal errors were
absent with the moving drone display after the second
experimental session. Improvement in recovery
performance is also observed with the moving horizon
indicator for the roll dimension. Recovery in pitch,
however, was not significantly different from the first to
the last day. The total number of reversal errors for the
outside-in display was eight, as compared to 67 for the
inside-out presentation (see Table 1).

Recovery time data are given in Fig. 2. Improvement is
noted with both displays, but the overall superiority is
in favor of the moving drone instrument. Improvement in
recovery time is also a significant effect (see Table 2).


Discussion and Conclusions

That training with the outside-in display for attitude
indication did not interfere with performance on the
inside-out display is clearly indicated by the results of
this study. In addition, recovery performance with the
moving drone display under the influence of vertigo was
superior to the moving horizon indicator on the very first
day of the experiment when neither pilot had previously
experienced the former instrument. Improvement in recovery
performance was observed with both indicators from the
first to the sixth experimental session. From this result,
it may be speculated that the estimated accident rate
caused by pilot disorientation due to vertigo can be
minimized if training procedures for pilots included
recovery experience under induced vertigo similar to
procedures of this study.
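
The binomial analysis of reversal errors can be illustrated with a short sketch. It assumes, since the article does not say so explicitly, that each reversal error was treated as equally likely to fall on either indicator (p = .5) and that the tabled probabilities are one-tailed. The function name `binomial_tail` is hypothetical. Under these assumptions the legible roll rows of Table 1 (14 vs. 6 and 6 vs. 0) give .058 and .016, and the pitch rows fall below .001.

```python
from math import comb


def binomial_tail(k_small: int, n: int) -> float:
    """One-tailed probability of k_small or fewer errors out of n when an
    error is equally likely on either indicator (assumed p = 0.5)."""
    return sum(comb(n, k) for k in range(k_small + 1)) / 2 ** n


# Error counts (one indicator, the other) from the legible rows of Table 1.
rows = {
    "Sessions 1-2, roll": (14, 6),
    "Sessions 1-2, pitch": (21, 2),
    "Sessions 3-4, roll": (6, 0),
    "Sessions 3-4, pitch": (12, 0),
    "Sessions 5-6, pitch": (13, 0),
}

for label, (first, second) in rows.items():
    # The tail is taken on the smaller count, so column order does not matter.
    p = binomial_tail(min(first, second), first + second)
    print(f"{label}: p = {p:.4f}")
```

The pitch rows come out well under .001, which would be consistent with .001 having been printed as a floor value; that reading is an inference, not something the article states.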


Table 2
Summary of Analysis of Variance for Recovery Time Data (2 pilots)

Source            df     Sums of Squares   Mean Squares      F         p
Indicators (I)      1         18.2659         18.2659       9.158   p < .01
Sessions (S)        5        353.6962         70.7392      35.465   p < .001
(Cells)           (11)      (403.5252)
I × S               5         31.5631          6.3126       3.165   p < .01
Within cells      372        741.9825          1.9946
Total             383
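
The arithmetic of Table 2 can be checked directly. The sketch below assumes the usual fixed-effects layout in which each F ratio is the effect mean square divided by the within-cells mean square; the variable and function names are hypothetical, and the numbers are those printed in the table.

```python
# Sums of squares and degrees of freedom as printed in Table 2.
effects = {
    "Indicators (I)": (18.2659, 1),
    "Sessions (S)": (353.6962, 5),
    "I x S": (31.5631, 5),
}
ss_within, df_within = 741.9825, 372


def f_ratio(ss: float, df: int, ms_error: float) -> float:
    """F = (effect sum of squares / effect df) / error mean square."""
    return (ss / df) / ms_error


ms_within = ss_within / df_within  # within-cells mean square
for name, (ss, df) in effects.items():
    print(f"{name}: MS = {ss / df:.4f}, F = {f_ratio(ss, df, ms_within):.3f}")

# The three effects should add up to the bracketed (Cells) entry.
ss_cells = sum(ss for ss, _ in effects.values())
print(f"Cells SS = {ss_cells:.4f}, df = {sum(df for _, df in effects.values())}")
```

The three effect sums of squares add to the bracketed Cells entry (403.5252 on 11 df), and the degrees of freedom add to the printed total of 383.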




It can be concluded that (a) the outside-in 
presentation leads to fewer misinterpretations 
of aircraft attitudes than the conventional 
display and (b) negative transfer effects be- 
tween instruments are either absent or negli- 
gible. 


Received August 15, 1957. 
Early Publication. 


References 


1. Chapanis, A., Garner, W. R., & Morgan, C. T.
Applied experimental psychology; human factors
in engineering design. New York: Wiley,
1949.

2. Fitts, P. M. Engineering psychology and equip- 
ment design. In S. S. Stevens (Ed.), Hand- 
book of experimental psychology. New York: 
Wiley, 1951. Pp. 1287-1340. 

3. Loucks, R. B. An experimental evaluation of the 
interpretability of various types of aircraft 
attitude indicators. In P. M. Fitts (Ed.), 
Psychological research on equipment design. 
U. S. Government Printing Office, 1947. Pp. 
111-135. 

4. Wendt, G. R. Vestibular function. In S. S. 
Stevens (Ed.), Handbook of experimental 
psychology. New York: Wiley, 1951. Pp.
1191-1223.


Journal of Applied Psychology
Vol. 42, No. 2, 1958


A Simple Method of Recording Paired Comparisons 


Edward N. Hay 


Philadelphia, Pennsylvania 


There are many situations in which the ad- 
vantages of paired comparison outweigh the 
disadvantage of the greater amount of labor 
required. Arranging data in rank order, for 
example, is not laborious, but the ranks as- 
signed to items that are of the same or nearly 
the same value are not always easy to decide, 
because a number of items must be compared 
simultaneously or compared in turn, with 
check and recheck. It is like juggling five or 
six Indian Clubs at once. 

Paired comparison, however, reduces the 
process to a series of simple judgments of
one item against another, with never more 
than two things involved in each comparison. 
There are, of course, more comparisons to 
make, but all except a few can be quickly de- 
cided. Only when each item of a pair is of 


nearly the same value is any great delibera- 
tion necessary. This simplification of judg- 
ments usually promotes greater accuracy. 

However, there are many judgments to re- 
cord and summarize, which is inconvenient. 
This paper describes a way of recording the 
comparisons which is easy to do and which 
provides a clear and permanent record. It 
also furnishes a ready means of checking the 
arithmetical accuracy of the work. 

Figure 1 illustrates a simple way of record- 
ing the 153 comparisons necessary to deter- 
mine the order of rank of 18 items. The data 
relate to the amount of “know-how” required 
to perform acceptably the duties of 18 jobs 
which were being evaluated according to the 
Guide Chart-Profile Method (1). 

In Fig. 1, it can be seen that there are two
cells in either of which the result of comparing
any two jobs can be entered by tally.
The tally is to be entered, in the case of each
pair, in the line opposite the name of the job
thought to rank higher. For example, the
first two jobs in the table are No. 84-5 Amortization
Schedule Clerk No. 2 and No. 93-6
Letter Writer. After comparing these jobs, it
was clear that the Letter Writer required the
greater know-how, so a tally was placed on
the line opposite that job name in the column
under the other job, Amortization Schedule
Clerk No. 2.


[Figure 1: tally chart comparing 18 key jobs. Each job
heads one row and one column; tally marks record the
outcome of each comparison, and a Points column gives
each job's total, with a grand total of 153. Legible job
titles include Amortization Schedule Clerk No. 2, Letter
Writer (IDS and ISA), Traveling Mortgage Auditor,
Valuation Accountant, Loan Notice Typist, Posting Machine
Operator, Ledger Card Puller, Manager Office Services, and
Transcriber (Typist). The individual tallies and point
totals are not legible.]

Fig. 1. Investors Diversified Services, Inc.
Comparison of key jobs.

From this description it can be seen, after 
all 153 comparisons have been made, that the 
job having the greatest number of tallies on 
the horizontal line to its right will be the 
highest ranking job of the group of 18 jobs. 
If any two or more jobs have the same number of
tallies then they can be considered to
have been ranked equally.

In the case illustrated in Fig. 1 there were 
8 judges. Their opinions were pooled by 
adding all their tallies, and the final rank of 
jobs was taken from those totals. 

The only check for accuracy that is necessary
is to be sure that the grand total
of all tallies is 153. For any number of
items this total is found from the formula
N(N − 1)/2. In the case at hand, N = 18
and therefore (N − 1) = 17 and (18 × 17)/2
is 153.
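
The recording scheme and its arithmetic check lend themselves to a brief sketch. The function names, the four job titles used, and the numeric "know-how" scores standing in for a judge's opinion are all hypothetical and serve only to make the example runnable; the tallying rule (one tally to the preferred job of each pair) and the N(N − 1)/2 check are those described above.

```python
from collections import Counter
from itertools import combinations


def n_pairs(n: int) -> int:
    """Number of distinct pairs among n items: N(N - 1)/2."""
    return n * (n - 1) // 2


def rank_by_paired_comparison(items, prefer):
    """Tally one point for the preferred item of every pair, then rank."""
    tallies = Counter({item: 0 for item in items})
    for a, b in combinations(items, 2):
        tallies[prefer(a, b)] += 1
    # Arithmetic check: the grand total must equal N(N - 1)/2.
    assert sum(tallies.values()) == n_pairs(len(items))
    return tallies.most_common()


# Hypothetical judgments: a higher score means more know-how required.
know_how = {
    "Letter Writer": 3,
    "Amortization Schedule Clerk No. 2": 2,
    "Ledger Card Puller": 1,
    "Valuation Accountant": 4,
}
ranking = rank_by_paired_comparison(
    list(know_how), lambda a, b: a if know_how[a] > know_how[b] else b
)
for job, points in ranking:
    print(points, job)
print("pairs for 18 jobs:", n_pairs(18))
```

Pooling several judges, as in the case with 8 judges, would amount to adding their tally counters before ranking.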


Received September 19, 1957. 
Early Publication. 


Reference 


1. Hay, E. N., & Purves, D. A new method of job 
evaluation: the guide-chart profile method. 
Personnel, 1954, 32, 72-80. 




Journal of Applied Psychology 


Vol. 42, No. 4


AUGUST, 1958 


The Relative Susceptibility of Two Rating Scales to
Disturbances Resulting from Shifts in
Stimulus Context ¹


Donald T. Campbell, William A. Hunt, and Nan A. Lewis 


Northwestern University 


Several effects typically found in psycho- 
physical judgments with the method of single 
stimuli have been demonstrated by the au- 
thors in ratings of the degree of schizophrenic 
disturbance evidenced in responses to vocabu- 
lary items (2). For example, strongly skewed 
contexts produce a contrast effect in judg- 
ments of middle items, while sharp shifts in 
context produce a loss of discrimination. 
While such effects per se have practical value 
for the understanding and refinement of clini- 
cal judgment, they also may be used as cri- 
teria for the comparative evaluation of rat- 
ing scales, since they represent distortions of 
judgment to be avoided if possible. 

The present study compares the susceptibility
to such distortions of two types of rating
scales: one a simple, numerical, nine-point
scale called the “Simple,” the other, the
“Detailed,” a nine-point scale on which each
numerical point is provided with a verbal
description.


Method

The stimuli were vocabulary responses selected primarily
from those scaled by Arnhoff (1), extended
by the addition of some normal responses as previously
reported (2). Sample items with their values
on a nine-point scale follow:


1. Gamble: to take a chance, a risk.
[?]. Fur: all of an animal’s coat.
[?]. Rim: outside diameter with a margin.
[?]. Envelope: something you put it in for them.
9. Stave: that’s before, that’s long before not happiness.


¹ This study is part of a larger project subsidized
by the Office of Naval Research under Contract
Tonr-450 [...] with Northwestern University. The opinions
expressed here are those of the individual authors
and do not represent the opinions or policy of
the naval service.


The stimuli were presented in booklet form with 
five items to a page, as indicated in Table 1. For 
the “Low-High” condition, each of the first 10 pages 
constituting the initial phase contained five items 
representing scale values from 1 to 5. The order of 
scale values was different for each page as deter- 
mined by a pair of balanced latin squares. The next 
six pages provided a gradual shift to higher scale 
values, with the last 10 pages each containing stimuli 
of Values 5 to 9. The “High-Low” condition used 
the same 26 pages in reverse order. Items of scale 
Value 5 are thus present in both low and high con- 
texts and provide the common denominator for 
evaluating shifts or disturbances in judgment. Full 
details of the counterbalanced design employed and 
the guarantees of equivalence of items of Value 5 in 
both low and high contexts are found in the previous
study (2). Suffice it to say that two versions of
each page were prepared for the initial and terminal 
phase, differing in the specific item of Value 5 em- 
ployed and so counterbalanced that while no judge 
rated the same item twice, the same specific items 
were judged under all conditions; and that within 
the initial and terminal phases, 10 different orders of 
page assembly were employed, following another 
balanced latin square. In all, 80 different types of 
booklet were employed, two of each being used. 

For the Simple form, the instructions to the judges 


were as follows: 


On the pages that follow you will be shown defi- 
nitions of vocabulary words made by both nor- 
mal and schizophrenic individuals. Your task is 
to rate each one of these definitions according to 
the degree of organization and eccentricity which 
you think is present. You are to make your rat- 
ings on a nine (9) point scale which ranges from 
“well-organized and normal” to “totally disorgan- 
ized and eccentric.” The category one (1) should 
represent the most organized and normal defini- 
tions. The category nine (9) should represent the 
maximal amount of disorganization and eccentricity
as found in schizophrenic thinking. The
intermediate categories should indicate amounts in 
accordance with their numerical value. You will 
simply write the number which you think best 
depicts each definition at the beginning of each 
statement. Be sure that you do not skip any defi- 
nition, and rate each one as you come to it. 


It is very important that you do not rate defini- 
tions according to how intelligent you think the 
person was who made the statement. Hence, even 
though a definition is incorrect, if it is in no way 
eccentric or disorganized, it should be rated to- 
ward the low end of the scale. On the other hand, 
if it shows signs of disorganization or eccentricity, 
even though fairly accurate, it should be rated to- 
ward the high end of the scale. Try to be as dis- 
criminating as possible. 


For the Detailed form, where each scale point was 
verbally defined, the instructions to the Ss were as 
follows: 


On the pages that follow, you will be shown definitions
of vocabulary words made by both normal
and schizophrenic individuals. Your task is to
rate each one of these definitions according to the 
degree of organization and eccentricity which you 
think is present. The scale provided below is to 
be used in making your ratings. You will simply 
write the number which you think best depicts 
each definition at the beginning of each statement. 
Be sure that you do not skip any definition, and 
rate each one as you come to it. 


It is very important that you do not rate defini- 
tions according to how intelligent you think the 
person was who made the statement. Hence, even 
though a definition is incorrect, if it is in no way 
eccentric or disorganized, it should be rated to- 
ward the low end of the scale. On the other hand, 
if it shows signs of disorganization or eccentricity, 
even though fairly accurate, it should be rated to- 
ward the high end of the scale. Try to be as dis- 
criminating as possible. 


1. A very normal, well-organized definition. 

2. A fairly normal, well-organized definition. 

3. Very slight traces of disorganization and eccentricity
are present.


Table 1
Design of the Experimental Booklets

Low-High Groups                                      High-Low Groups
Phase         Pages     Scale values on each page    Pages     Phase
Initial       1-10      1 to 5                       26-17     Terminal
Transition    11-16     gradual shift from the       16-11     Transition
                        low to the high range
Terminal      17-26     5 to 9                       10-1      Initial




Table 2

Mean Ratings of the Item of Value 5 on the Last Page of the Initial Phase
(N = 40 for Each Mean)

             High Context         Low Context
             (High-Low Group)     (Low-High Group)
             Mean      σ²         Mean      σ²          t        p
Simple       3.40     5.88        6.80     3.56        7.00     .001
Detailed     3.77     4.78        4.95     3.80        2.55     .01


4. Distinct traces of disorganization and eccentricity
are shown.
5. Obvious signs of disorganization and eccentricity
are present.
6. Very eccentric and disorganized, but still showing
signs of contact with reality.
7. Very eccentric and disorganized, showing only
a thin thread of coherence.
8. Extremely disorganized and eccentric.
9. Totally disorganized, eccentric, and out of contact
with reality.

The Ss were 160 undergraduate students in an introductory
psychology course at Northwestern University,
allowing 40 Ss for each of the four conditions
provided for by using each of the two rating
scales with both Low-High and High-Low context
changes. Sampling equivalence for the groups
was achieved by randomizing the booklets before
distribution to the Ss.


Results 


Differences between groups exposed to different
stimulus contexts. We can examine the
biasing effects of stimulus context by comparing
ratings of our middle value stimuli as
given by the High-Low and Low-High groups
at the end of the initial context (p. 10 in the
series of stimuli). We should expect a contrast
effect, with the middle value items being
rated as less disturbed when they appear
in a context of highly disturbed responses
(High-Low group) and more disturbed when
presented in a context of less disturbed responses
(Low-High group). As Table 2
shows, the contrast effect appears clearly for
both rating forms. However, it is much less
pronounced for the Detailed form. Since the
number of cases involved in each comparison
is equal, we can make a direct comparison
of the t ratios. The difference
between these t ratios is in itself
highly significant (t = 3.14, p < .002), con-

firming the greater susceptibility of the Sim- 
ple form to distortions produced by stimulus 
context. As such distortions of judgment are 
undesirable, we may call the Detailed rating 
form superior in this regard. Note that apart 
from response to distorting context, the two 
scales would probably produce different means 
and standard deviations. The comparison of 
t ratios enables us to compare the degree of 
distortion independent of these differences. 
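
The t ratios of Table 2 and the comparison between them can be reproduced approximately from the tabled summary values. This sketch assumes that the dispersion entries in Table 2 are variances (the printed t values are recovered only under that reading) and that the two t ratios were compared as independent statistics of unit variance, which is one plausible reconstruction and not a procedure the article spells out. The function name is hypothetical.

```python
from math import sqrt


def t_from_summary(mean_a, var_a, mean_b, var_b, n):
    """t for two independent groups of equal size n, given means and variances."""
    return (mean_a - mean_b) / sqrt((var_a + var_b) / n)


N = 40  # judges per mean, as stated in Table 2

# Low context (Low-High group) minus high context (High-Low group).
t_simple = t_from_summary(6.80, 3.56, 3.40, 5.88, N)
t_detailed = t_from_summary(4.95, 3.80, 3.77, 4.78, N)

# Assumed comparison: difference of two independent t ratios over sqrt(2).
t_difference = (t_simple - t_detailed) / sqrt(2)

print(f"Simple   t = {t_simple:.2f}")
print(f"Detailed t = {t_detailed:.2f}")
print(f"Difference between t ratios = {t_difference:.2f}")
```

The first two values agree with the printed 7.00 and 2.55; the third lands within about .01 of the reported 3.14, so the reconstruction is close but should be treated as approximate.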
Shifts in judgment produced by reversal of 
stimulus context. Here we are interested in 
the changes produced by reversing the con- 
text from low to high or high to low as the 
experiment proceeds. This involves a com- 
parison of the ratings of the “5” items in the 
initial context (pp. 1-10 of the booklet) with 
those in the final context (pp. 17-26). Where
our previous analysis was in terms of groups, 
we will now analyze our data in terms of indi- 
vidual Ss. This is necessitated by the fact 
that whereas contrast phenomena are strongly 
manifested in the comparison of initial con- 
texts, reversing the context in a single experi- 
mental session has a mixed effect, producing 
contrast effects in some Ss, but assimilation 
effects in others (2). Thus some Ss showed 
the expected effect of contrast, and when the 
general context of items moved from low to 
high, for example, their judgments of the 
“5’s” became lower than they had been. But
for the assimilators, the shift of context from 
low to high resulted in rating the “5’s” higher 
than they had been rated before. Thus for 
some Ss, the value of the middle items 
changed toward the new context rather than 
away from it as contrast would demand. Since 
“assimilators” and “contrastors” move in opposite
directions, they tend to cancel one another
in any group treatment and the data
must be analyzed in terms of changes in judg- 
ment for single individuals. 

To make this evaluation of individual shifts
maximally independent of the judgmental
idiosyncrasies of individual judges we computed
separately for each judge a t ratio between
his judgments of the ten “5” items in
the initial context (pp. 1-10) and the ten
“5’s” of the terminal context (pp. 17-26).
Each of these t ratios was then used as a
score for an individual judge, representing his
inconstancy in the face of the shifting context.
Since the shifts take place in two directions,
we have used minus signs to designate assimilation
errors and plus signs to indicate contrast.
For the Simple rating form the t’s
ranged from − 7.22 to + 7.04, and for the
Detailed form from − 8.96 to + 6.50. These
t’s show surprisingly large values considering
that they are based upon an N of only 20
items.
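
A per-judge shift score of this kind might be computed as follows. The sketch assumes an independent-groups t with pooled variance (the article mentions an N of 20 items but does not name the exact formula), and it applies the sign convention described above: plus for contrast, minus for assimilation. The function name, the condition labels as strings, and the ratings are hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def shift_t(initial, terminal, condition):
    """Signed t ratio for one judge's ratings of the "5" items.

    Positive values mean contrast (ratings move away from the new
    context); negative values mean assimilation.
    """
    n1, n2 = len(initial), len(terminal)
    pooled = ((n1 - 1) * variance(initial) + (n2 - 1) * variance(terminal)) / (n1 + n2 - 2)
    se = sqrt(pooled * (1 / n1 + 1 / n2))
    # Low-High: contrast lowers the later ratings, so initial - terminal > 0.
    diff = mean(initial) - mean(terminal)
    if condition == "High-Low":
        diff = -diff  # contrast raises the later ratings in this condition
    if se == 0:
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


# Hypothetical ratings by one Low-High judge of ten "5" items per phase.
initial = [6, 6, 5, 6, 7, 6, 5, 6, 6, 7]
terminal = [4, 5, 4, 4, 5, 3, 4, 5, 4, 4]
print(f"shift score = {shift_t(initial, terminal, 'Low-High'):+.2f}")
```

A judge whose ratings do not change at all receives a score of zero, which corresponds to the Ss described later as showing neither contrast nor assimilation.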

To analyze the data for degree of shift, disregarding
its direction, we have compared the
two rating forms in terms of the magnitude of
the t ratios, disregarding sign. There are
slight trends in both the Low-High and High-Low
groups for the smaller t’s to be found
with judges using the Simple form. These
trends do not reach significance, however, and
when Low-High and High-Low groups are
pooled, the t ratio of the Simple and Detailed
form comparison (using the individual t ratios
as scores) is 1.30, giving a p value less than
.19. Thus, while shifts in judgment certainly
occurred as a result of shifting the stimulus
context during the experiment, the two rating
forms show no significant differences in the
magnitude of these shifts.

If we examine the plus (contrast) and
minus (assimilation) t’s the differences in
sign can tell us whether either rating form
favors the phenomena of contrast or of assimilation.
The 80 Ss using the Simple form
separate into 47 contrastors and 33 assimilators.
This trend is reversed for the Detailed
form which offers 29 contrastors, 47 assimilators,
and 4 Ss showing neither. The difference,
however, is not significant, an overall t
ratio employing as scores the individual t’s
with their signs regarded being only .88.




Table 3

Mean r’s Between Experimental Values and
Previous Standardization Values

                  Initial Phase             Terminal Phase
                  pp. 1-10                  pp. 17-26

Low Context       L-H                       H-L
                  Detailed  .655            Detailed  .537
                  Simple    .655            Simple    .575

High Context      H-L                       L-H
                  Detailed  .642            Detailed  .557
                  Simple    .528            Simple    .454


Accuracy of discrimination. An obviously
important characteristic of any scale is the
accuracy of the ratings it yields. Each judge
was given two accuracy scores, one for his
judgments of the 50 items of the initial phase,
one for his judgments of the 50 items of the
terminal phase. Correlation coefficients between
his ratings of the 50 items and their
standardization values constituted these scores.
The coefficients were transformed into z scores
to provide normally distributed individual indices
for testing the significance of the differences
between the experimental groups. The
mean values, retransformed into r’s, are shown
in Table 3. Our first comparison for accuracy
concerned the initial phase (pp. 1-10). For
the Low-High groups the mean r was .655 for
the Simple rating form and .655 for the Detailed.
For the High-Low groups the comparable
mean r’s were .528 and .642. The t
ratio of this last difference is 5.24, significant
beyond the .0001 level, indicating that while
the two forms are equal when used with low
context items, the Detailed form is superior
when used with high context items. Similar
findings appear when the coefficients for the
terminal phase are inspected. For the Low-High
groups (now judging items in a high
context) the mean r’s were .454 for the Simple
form and .557 for the Detailed, t = [illegible],
p < .0006. For the High-Low groups (now
judging items in a low context) no significant
differences were found between the two scales
(values of .575 and .537). We can conclude
that the Detailed form provides more accuracy
when used in a high disturbance context.
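
The transformation step can be sketched briefly. It assumes the z scores referred to are Fisher's transformation, z = arctanh(r), which is the standard way to obtain approximately normally distributed indices from correlation coefficients; the function name and the sample coefficients are hypothetical.

```python
from math import atanh, tanh


def mean_r_via_z(coefficients):
    """Average correlations by transforming to Fisher z, averaging, and
    transforming the mean back to r."""
    z_scores = [atanh(r) for r in coefficients]
    return tanh(sum(z_scores) / len(z_scores))


# Hypothetical accuracy scores (r with standardization values) for four judges.
judge_rs = [0.45, 0.60, 0.70, 0.80]
print(f"mean r via z = {mean_r_via_z(judge_rs):.3f}")
print(f"plain mean   = {sum(judge_rs) / len(judge_rs):.3f}")
```

Because the transformation stretches the scale near 1, the retransformed mean gives slightly more weight to the higher coefficients than a plain average does; significance tests between groups would be run on the z values themselves.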




It also seemed appropriate to get an overall 
measure of accuracy by correlating the judges’ 
ratings on all 100 items with the standardiza- 
tion values. The judges using the Detailed 
form performed better than those using the 
Simple form, although the difference was not 
as significant as some reported above. The 
mean r’s were .764 for the Detailed form and
.700 for the Simple, t = 2.26, p < .024.

Loss of refinement of discrimination with
shift in context. One of the most traditional
findings when using the method of single
stimuli is the loss of refinement of discrimination
which occurs with any drastic shift in
the stimulus range such as is produced by the
introduction of an extreme anchor or by
shifts in context such as those in the present
experiment. This loss in discrimination is
illustrated in Table 3 which presents the
mean r values in terms of a 2 × 2 latin
square. Note that each of the four values to
the right is lower than the corresponding one
on the left. This illustrates the drop in accuracy
in the terminal phase following the extreme
shift in context. A control seems to be
lacking here since the shift in context is confounded
with possible effects such as fatigue.
A careful, page-by-page analysis of the data
of our previous experiment (2), however,
showed no general drop within each phase;
rather, the loss in discrimination was clearly
concentrated at the point of transition between
contexts.

To evaluate the significance of the trends
in Table 3 the data have been analyzed as a
[...] design (3). The effect of phase is
highly significant with an F ratio of 17.56 for
the Simple form and 36.85 for the Detailed.
Thus the Detailed form seems to show the
greater drop in accuracy of discrimination.
In part, this higher F ratio for the Detailed
form is attributable to a slightly smaller error
term, but it is also attributable to a larger
mean square for phases in the analysis of variance.
The significance of this finding is hard
to evaluate and its interpretation is complex.
It probably should not be taken as a sign of
inferiority for the Detailed form, as even with
the greater drop, the Detailed form averages
somewhat superior to the Simple form for the
terminal phase, as is indicated in Table 3.


Summary 


This study has compared two rating scales 
in terms of their resistance to the distorting 
effects produced by limited and shifting con- 
texts of stimulus materials. The assignment 
called for rating the degree of schizophrenic 
disturbance shown in definitions of words. 
One nine-point scale provided a minimum of 
descriptive material, while the other provided 
a verbal characterization for each of the nine 
scale values. The distorting effects examined 
were of two general kinds: shifting in the 
value of common stimuli as a function of con- 
text, and the loss of refinement or correla- 
tional accuracy. While in the subtler details 
the overall picture is complex, in general, the 
detailed rating scale has shown itself to be 
superior. It provides more nearly equivalent 
judgments from comparable groups of raters 
judging common items in disparate stimula- 
tion contexts. For stimulus materials in the 
high disturbance range, it provides a greater 
correlational accuracy. 

More generally, the approach suggests the 
utility of creating experimental stress tests in 
the evaluation of rating scales and other judgmental
procedures.

Received September 26, 1957.


References 


1. Arnhoff, F. N. Some factors influencing the un- 
reliability of clinical judgments. J. clin. Psy- 
chol., 1954, 10, 272-275. 

2. Campbell, D. T., Hunt, W. A., & Lewis, N. A.
The effects of assimilation and contrast in 
judgments of clinical materials. Amer. J. 
Psychol., 1957, 70, 347-360. 

3. Cochran, W. G., & Cox, G. M. Experimental de- 
signs. New York: Wiley, 1950. 


Journal of Applied Psychology
Vol. 42, No. 4, 1958


A Factor Analysis of Variables Related to Driver Training ¹


Andrew L. Comrey 


University of California, Los Angeles 


A course given in the Los Angeles High 
Schools called “Driver Training” is devoted 
to observation, instruction, and actual behind- 
the-wheel practice. Each student electing 
this course is required to fill out a card giv- 
ing certain biographical and other data. In- 
structors add information to these cards con- 
cerning the students’ performance. There 
were 1491 such students during the spring 
semester of 1954, a period chosen for study 
because it was long enough after the initia- 
tion of the program for record keeping to 
have become standardized and long enough 
ago to make it possible for the individuals in- 
volved to have accumulated a driver record. 

The names of these 1491 individuals were 
sent to the California State Department of 
Motor Vehicles to obtain the record of their 
subsequent accidents 
to February, 1957, 
be located for 373 
two additional cases were dropped at random 
to make the total nu 


the demands of the 


on available on each 
very limited, Thirty- 


time consuming for the DEE difficult and 
icles, 


218 


six variables were extracted from the school 
and department of motor vehicle records, 
however, which seemed to offer the possi-
bility of sufficient information to justify an
analysis. Some of these variables were avail-
able in continuous form, such as “height,”
and others were available in only dichoto-
mous form, e.g., “sex,” but all variables in
the analysis were reduced to dichotomous
form before computing phi coefficients, i.e.,
correlating each variable of the 36 with each
other. Two matrices were obtained, one for
each sample.
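
The dichotomize-then-correlate step described above can be illustrated with a minimal sketch. The function names and the sample data are hypothetical and are not taken from the study; the phi formula is the standard one for a fourfold table.

```python
import math


def dichotomize(values, cut):
    """Code each value 1 if it lies above the cut point, else 0."""
    return [1 if v > cut else 0 for v in values]


def phi_coefficient(x, y):
    """Phi coefficient between two dichotomous (0/1) variables.

    a, b, c, d are the cells of the fourfold table:
    a = both positive, d = both negative, b and c = mixed.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # a variable with no variation has no defined phi
    return (a * d - b * c) / denom


# Hypothetical example: heights split at an assumed median of 64 in.
tall = dichotomize([60, 70, 65, 62], cut=64)
male = [0, 1, 1, 0]
print(phi_coefficient(tall, male))
```

Because phi depends on the marginal totals, a variable with a very uneven split cannot reach large values of phi with a variable that is split evenly.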

The variables used are listed in Table 1.
The motor vehicle data were classified and
grouped into categories, forming Variables
1 through 8. A frequently occurring in-
fraction could be treated separately, but sel-
dom invoked infractions had to be combined
with others of a somewhat similar type.
Variable 2, for example, includes several
kinds of violations, but mostly traffic light
and boulevard stop violations. Unsafe driv-
ing, or “moving,” violations other than those
in Variables 2 and 4 were grouped in Vari-
able 3. Variable 11 concerns a difference
in address between high school and depart-
ment of motor vehicles records. Variables
12 through 18 were included to test the as-
sociation of geographical location with the
other variables. Schools were grouped ac-
cording to area of the city. Variable 19 is
also a school variable, indicating attendance
at one of three special schools with many
problem students. For Variables 21 and 23,
separate medians for girls and boys were
taken, using small random samples from the
total group. Dichotomization was carried out
with respect to these rough measures of cen-
tral tendency. Variable 31 refers to the num-
ber of hours the student spent in an instruc-
tion car as an observer. For each variable,
the positive side of the dichotomy is de-
scribed or listed first.


Table 1
A Summary of the Principal Rotated Factor Results


Variable p A B C D E F G
1. One or more nonmoving violations 62 62 
2. One or more signal violations 18 49 13 
3. Unsafe driving violations 13 42 
4. One or more speeding violations 23 7 1 
5. Two or more violations 19 91 
6. One or more accidents 25 17 
7. A violation plus an accident 14 38 8i 
8. Restrictions on driver license 15 
9. School Grade 12 status vs. others 36 80 
10. Los Angeles address vs. others 49 3 
11. Change of address after school 34 
12. Valley area schools 31 
13. Harbor area schools 10 
14. Eastern area schools 10 12 
15. Southern area schools 08 
16. Metropolitan area schools 16 
17. Western area schools 11 -93 
18. Hollywood area schools 10 -a5 24 
19. Special schools 04 tS 18 -17 16 
20. Sex (male vs. female) 4940 
62 
21. Height (above Mdn for own sex) 60 79 
22. Age (17 or more) ao 9 66 
23. Weight (above Mdn for own sex) 46 60 
24. Eye color (dark vs. others) 46 52 
25. Hair color (dark vs. others) 81 
26. Father has a business address 51 
27. Father has a business phone 07 
28. Mother has a business address ie 
29. Mother has a business phone e 
30. Both parents are living 69 
31. 12 or more observation hours 5 62 
32. Driver Training Grade of A PA 72 
33. Driver Training Grade of A or B af —33 
34. 50 or more class instruction hours = 76 
35. Three or more students in car 5 
36. Unfavorable instructor remarks 31 
Note.—Decimal points have been omitted. 


Eighteen centroid factors were extracted
from each matrix of phi coefficients.² Each
set of 18 centroid factors was independently
rotated analytically using Kaiser’s orthogonal
Varimax Method (1). This procedure pro-
vides a solution approximating simple struc-
ture by maximizing the variance of the
squared extended vector projections over all
possible pairs of factors. Iteration proceeds
until an acceptable degree of convergence oc-
curs. A good simple structure was obtained
in both analyses.

² All the calculations for this study were carried
out on SWAC, an electronic computer operated by
Numerical Analysis Research at the University of
California, Los Angeles, and supported by the Office
of Naval Research. The opinions expressed are
the author’s and do not necessarily represent those
of the US Navy. The complete correlation, centroid,
and rotated factor tables have been deposited with
the American Documentation Institute. Order Docu-
ment No. 5595 from the ADI, Auxiliary Publications
Project, Photoduplication Service, Library of Con-
gress, Washington 25, D. C., remitting in advance
$1.25 for 35 mm. microfilm or $1.25 for photocopies
readable without optical aid. Make check payable
to Chief, Photoduplication Service, Library of Con-
gress.
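
The pairwise iteration described in the text can be sketched as follows. This is a minimal illustration and not the SWAC program used in the study: it applies the raw (unnormalized) varimax criterion, whereas the text's "extended vector projections" suggests the normalized form, and all function names are hypothetical. For each pair of factors the rotation angle comes from Kaiser's closed-form expression for the maximizing angle.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    n = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        sq = [row[j] ** 2 for row in loadings]
        mean = sum(sq) / n
        total += sum((s - mean) ** 2 for s in sq) / n
    return total


def rotate_pair(loadings, p, q):
    """Rotate factors p and q in place by the angle that maximizes
    the raw varimax criterion for that pair; return the angle."""
    n = len(loadings)
    u = [row[p] ** 2 - row[q] ** 2 for row in loadings]
    v = [2.0 * row[p] * row[q] for row in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    angle = math.atan2(d - 2.0 * a * b / n, c - (a * a - b * b) / n) / 4.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    for row in loadings:
        x, y = row[p], row[q]
        row[p] = x * cos_a + y * sin_a
        row[q] = -x * sin_a + y * cos_a
    return angle


def varimax(loadings, tol=1e-8, max_sweeps=100):
    """Sweep over all pairs of factors until no pair needs a rotation
    larger than tol radians. Returns a rotated copy of the loadings."""
    rotated = [list(row) for row in loadings]
    k = len(rotated[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for p in range(k - 1):
            for q in range(p + 1, k):
                largest = max(largest, abs(rotate_pair(rotated, p, q)))
        if largest < tol:
            break
    return rotated
```

Because each step is an orthogonal rotation of two columns, the communality of every variable (the row sum of squared loadings) is unchanged; only the distribution of variance among factors moves.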


Results 


Of the 18 factors extracted in each analy- 
sis, 16 were sufficiently identical to warrant 
being called the same factor. Two in each 
analysis failed to match. Since the results 
for the two analyses were so similar, figures 
are given in Table 1 only for the first analy- 
sis in order to conserve space. For the fig- 
ures given in Table 1, the corresponding 
values for the second analysis agree within 
.05, except in the following cases: Factor A,
Variables 2 and 3 were .41 and .58; Factor B,
Variable 4 was .24; Factor D, Variable 18
was −.12; Factor E, Variable 19 was −.25;
and Factor G, Variable 19 was .14. The cri-
terion for including a figure in Table 1, rather
than leaving a blank space, was that the load-
ing had to be .1 or more in both analyses, or
.2 or more in either analysis.
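
The inclusion rule for Table 1 can be written out as a small predicate. This is a sketch only: the function name is hypothetical, and applying the thresholds to absolute values is an assumption, since the text does not say how negative loadings were treated.

```python
def include_in_table(loading_first, loading_second):
    """True if a loading earns a printed entry rather than a blank:
    .1 or more in both analyses, or .2 or more in either analysis.
    Absolute values are assumed so that negative loadings qualify."""
    a, b = abs(loading_first), abs(loading_second)
    return (a >= 0.1 and b >= 0.1) or a >= 0.2 or b >= 0.2


# The speeding-violation loadings on the accident factor (.12 and .24)
# meet the rule, so the entry is printed.
print(include_in_table(0.12, 0.24))
```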

Of the 18 factors extracted and rotated, 
only the main seven, Columns A through G, 
will be given in Table 1. The column headed 
by “p” gives the proportion of cases above 
the dichotomy point. Eight additional fac- 
tors were of some size, but six of these proved 
to be determined almost exclusively by the 
geographical area school variables, 11 through 
18, and Variable 10. Since the division of the 
students into geographical school areas intro- 
duced artificial interdependencies, and, hence, 
spurious correlations between these variables, 
it was necessary that several such factors 
emerge. Their appearance in no way distorts 
the principal factors, however, since a suffi- 
cient number of factors was extracted. Two 
other factors were confined to the Variables 
26 through 30. These factors failed to have 
major loadings for variables from the other 
sectors of interest. The remaining three fac- 
tors were very minor, two failing to be 
matched in the two analyses. 

The seven factors given in Table 1 are 
readily interpretable as: A. Traffic Law Viola-
tion, B. Accidents, C. Age, D. Course Grades, 
E. Greater Car Use, F. Physical Size, and G. 
Dark Coloring. A striking feature of the re- 
sults is the relative independence of the tend- 
encies to have accidents and to receive traffic 
citations. The major factor representing most 
of the variance for the citation variables had 
no loading of any importance for the accident 
variable. There was a sizeable loading for 
the variable “one or more accidents and one 
or more violations,” but since no loading ap-
peared for the pure accident variable, only
the “violations” part of this complex variable
presumably is involved. A small amount of
the traffic citation variance did appear on the
major factor defining accidents, however. The
variable “one or more speeding violations”
had loadings of .12 and .24, respectively, in
the two analyses. This picture is supported 
by reference to the original correlations be- 
tween “one or more speeding violations” and 
“one or more accidents.” For the two analy- 
ses, these phi coefficients were .16 and .24, re- 
spectively. Although these correlations are 
both significant beyond the .01 level, the pro- 
portion of common variance indicated by 
these coefficients is less than six per cent.
For the particular population represented by
these samples, therefore, it must be concluded
that there is only a slight tendency for per-
sons receiving speeding citations also to have
had accidents. With respect to nonspeeding
citations, there seems to be no relationship
with accidents.
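
The "less than six per cent" figure follows from squaring the two phi coefficients quoted above. A minimal check, with a hypothetical function name:

```python
def common_variance(r):
    """Proportion of variance shared by two variables correlated r."""
    return r * r


for phi in (0.16, 0.24):
    print(f"phi = {phi:.2f}: {100 * common_variance(phi):.2f} per cent common variance")
```

Even the larger coefficient, .24, corresponds to only about 5.8 per cent of shared variance, which is why a correlation can be statistically significant in samples of this size and still indicate only a slight tendency.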


There is little doubt that male drivers in
this population are more likely to receive cita-
tions and to have had more accidents than
female drivers. It is interesting to note, how-
ever, that the proportion of common variance
with the sex variable is about five times as
great for the traffic-law-violations factor as for
the accidents factor. In fact, the less than
four per cent common variance between the
accidents factor and sex is surprising in view
of the relative insurance risks commonly as-
signed to young male and female drivers.

Grades in driver training courses appar-
ently have no validity for predicting who will
receive traffic citations or have accidents.
Socioeconomic and geographic variables also
failed to show any important relationships to
traffic citations or accidents. The dark color-
ing factor failed to have any loading for the
accident or citation variables, tending to sug-
gest that Mexican and Negro groups are not
markedly different from other groups in these
respects. Students in the special-problem
schools showed a slight tendency to have
more citations although this same tendency
did not appear with respect to accidents.
Since these schools have many “problem”
students, the relationship with traffic offenses
is not surprising, except, perhaps, because it
is so low. The low p value of .04 may be
partially responsible for this, however, since
phi is dependent upon the marginal totals.

This study has succeeded more in showing
what accidents are not related to than what
they are related to. In short, with respect to
the population considered, we cannot single
out the traffic offender, the poor student, the
minority group member, the male, or the
driver from a particular part of town as be- 
ing much more likely than any other ran- 
domly selected population member to have 
had an accident. Since the present popula- 
tion consists only of individuals who elected 
driver training, it cannot necessarily be as- 
sumed that these same results would hold in 
the wider population. Many of these vari- 
ables may be more highly related to accidents 
in the general population, since individuals 
electing driver training are apt to be more 
conforming and generally less characterized 
by irresponsible behavior which may lead to 
accidents. 


Received August 9, 1957. 


Reference 


1. Kaiser, H. F. The varimax criterion for analytic
rotation in factor analysis. Psychometrika, in
press.


Journal of Applied Psychology 
Vol. 42, No. 4, 1958 


Effect of Time Limitation on Making Settings on a
Linear Scale ¹


David C. Greek ² and Arnold M. Small, Jr.
Lehigh University 


A number of variables have been investi- 
gated regarding cursor positioning perform- 
ance on a linear scale by means of control 
knobs (3, 4, 5). The most important vari- 
able seems to be control ratio, the ratio be- 
tween cursor movement and revolution of the 
control knob. Other variables such as knob 
diameter, friction, inertia, and backlash be- 
come important only under special conditions. 

The present study differed in two major re- 
spects from previous investigations. First, 
the time allowed an S to make a setting was 
varied systematically and, second, a measure 
of error (final discrepancy between cursor and 
target) was obtained. Varied simultaneously 
with time limitation were (a) direction of 
initial cursor displacement from target, (b) 
distance of cursor travel, and (c) control 
ratio. Both time and error were measured. 


Method 


Apparatus. From the S’s point of view the appa- 
ratus consisted of a large panel with an eye level 
rectangular hole cut from the middle. When not 
covered by a shield, the linear scale was visible 
through the cut-out. The scale itself consisted of a 
plain white card, }” by 11”, with a hairline scribed 
vertically at the center point. The cursor, a piece 
of lucite with a vertical hairline, was controlled by a 
knob, 2% in. diameter located at a convenient posi- 
tion for a seated right-handed S. 

The knob shaft was coupled to the cursor by a 
ball disk integrator (Western Electric KS-8710) and 
a magnetic clutch. The clutch was energized only 
during the trial interval. At the instant each trial 
started, the shield in front of the scale was dropped, 
the clutch energized, and two of the three timers 
(Standard Electric S-1) were started. Three time 
intervals were measured: (a) the overall time from
trial start to finish (total time), (b) the time from
trial onset until the cursor was moved to within
0.1 in. of the target (travel time), and (c) the time
from the 0.1 in. position until either the S was satis-
fied with his alignment and threw a switch stopping
the timers or the trial was ended by the E because
time allotted had expired (adjustment time). Time
was measured to the nearest 0.01 sec. and error to
the nearest 0.0025 in.

¹ This article is derived from a thesis submitted by
the senior author to the Department of Psychology
of Lehigh University in partial fulfillment of the re-
quirements of the degree of Master of Arts. The au-
thors wish to express their indebtedness to W. L.
Jenkins.

² Present address: Aero Medical Laboratory, Wright
Air Development Center, Wright-Patterson Air Force
Base, Ohio.

Procedure. All 12 Ss were right-handed and all 
participated in one practice session followed by nine 
experimental sessions. Sessions lasted about 45 mins. 
and were separated by approximately 48 hrs.

Each S was required to make settings at 12 time 
intervals in decreasing order (4.0, 3.0, 2.6, 2.2, 1.8, 
1.6, 1.4, 1.2, 1.0, 0.8, 0.6, and 0.4 secs.). All Ss be- 
gan with the 4 sec. interval. 

During each experimental session only one control 
ratio was used, selection being determined by a
counterbalanced order. Two Ss were assigned to
each of the six possible orders of the three ratios
(1 in., 2 in., and 4 in. of pointer movement per
revolution of knob).
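
The counterbalancing described above, six possible orders of three ratios with two Ss per order, can be sketched as follows. The function name and subject labels are hypothetical, and the article does not say how Ss were allocated to orders or how each order was spread over the nine sessions, so the simple block allocation here is an assumption.

```python
from itertools import permutations

RATIOS = (1, 2, 4)  # inches of cursor movement per knob revolution


def assign_orders(subjects, ratios=RATIOS):
    """Assign equal numbers of subjects to every order of the ratios."""
    orders = list(permutations(ratios))
    if len(subjects) % len(orders) != 0:
        raise ValueError("subjects must divide evenly among the orders")
    per_order = len(subjects) // len(orders)
    return {s: orders[i // per_order] for i, s in enumerate(subjects)}


assignment = assign_orders([f"S{i}" for i in range(1, 13)])
for subject, order in assignment.items():
    print(subject, order)
```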

Sessions, then, consisted of Ss making 144 settings.
At each of the 12 time intervals, 6 settings were
made involving cursor displacement to the right and
6 settings involving cursor displacement to the left.
Short travel (15/16”) was involved in 3 of these set-
tings and 3 involved long travel (50/16”). Except
for decreasing time and control ratio, conditions
within any session were presented in a chance fash-
ion. All Ss participated in all experimental condi-
tions.
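
The counts implied by this design can be checked with a short sketch. The function names are hypothetical, and one step is an inference rather than something the Procedure states: with nine experimental sessions and one ratio per session, each ratio is assumed to have been used in three sessions.

```python
N_SUBJECTS = 12
N_SESSIONS = 9
TIME_INTERVALS = (4.0, 3.0, 2.6, 2.2, 1.8, 1.6, 1.4, 1.2, 1.0, 0.8, 0.6, 0.4)
RATIOS = (1, 2, 4)             # inches per knob revolution
DIRECTIONS = ("right", "left")
DISTANCES = ("short", "long")
REPLICATIONS = 3               # settings per time x direction x distance cell


def settings_per_session():
    """Settings one S makes in a single session (one ratio)."""
    return len(TIME_INTERVALS) * len(DIRECTIONS) * len(DISTANCES) * REPLICATIONS


def experimental_conditions():
    """Distinct time x ratio x distance x direction combinations."""
    return len(TIME_INTERVALS) * len(RATIOS) * len(DISTANCES) * len(DIRECTIONS)


def observations_per_condition():
    """Observations pooled over all Ss for one condition, assuming each
    ratio was used in N_SESSIONS / len(RATIOS) sessions per S."""
    sessions_per_ratio = N_SESSIONS // len(RATIOS)
    return N_SUBJECTS * REPLICATIONS * sessions_per_ratio


print(settings_per_session())        # settings per session
print(experimental_conditions())     # experimental conditions
print(observations_per_condition())  # observations per plotted condition
```

Under that assumption the sketch gives 144 settings per session and 108 observations per condition, which agrees with the numbers reported in the Results; pooling the two directions doubles the latter to 216.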

Instructions. Excerpts from the Ss’ instructions
follow. “You are going to be given a certain amount
of time to make a setting. After several trials this
time interval will be reduced. Your task is to turn
the knob with your right hand so that the cursor
hair-line exactly superimposes the scale hair-line.
You are to do this as fast and as accurately as pos-
sible.”


Results 


Error. An analysis of variance for the or-
thogonal variables of allotted-time interval,
control ratio, distance of cursor travel, direc-
tion of initial cursor displacement from target,
Ss, and order of ratio presentation is summa-
rized in Table 1. The error terms used for
testing each of the main effects and interac-
tions are indicated by numerical coding in the
extreme right and left columns of the table.
It will be seen that since Ss may be regarded
as a random sample of Ss in general, those
higher-ordered interactions containing Ss are
the appropriate error term for lower-ordered
interactions and main effects (1, pp. 247-
252). Since the assumption of homogeneity
of variance was not met, the .001 level of
confidence was chosen as suggested by Lind-
quist’s review of the Norton study (6, pp.
78-90).

It will be noted that in Table 1, five of the
32 sources of variation account for over 95%


Table 1 
Summary of Analysis of Variance for Variables of Time, Ratio, Distance, Direction, Subjects, and Order 
Variance Error 
Source df Estimate F Term 
i 11 106.7534 405.25* 10 
: ed T 2 17.2564 69,59* 13 
3 Distance (Ds) 1 157.2590 189.80* 15 
4 Direction (Dr) 1 2.6974 e 16 
5 S’s within order (S) 6 1.3937 27.97 36 
6 Order (0) 5 3368 = 5 
22 3.0928 62.53* 35 
i H ig 11 38.9318 157.78* 21 
9 TXDr 11 .3684 1.75 22 
10 TXS 121 .2634 5.33* 35 
2 13.9037 45.87* 24 
11 RX Ds 2 2723 5.51 35 
12 RX Dr 2 2479 5.01* 35 
13 RXS 1 1.3807 3.60 26 
14 DsX Dr 11 (8285 16.75* 35 
15 Dsxs u «4650 9.4* 35 
z pe oF fis 2 2.6342 18.51* 28 
22 -0570 1.15 34 
18 TX RX Dr 242 .0547 1.11 34 
19 TXRXS 11 1639 1.01 30 
20 TX DsX Dr 121 2467 5.02* 34 
21 TXDsXS 121 .2086 4.24 34 
22 TXDrXxS 2 2557 4.11 31 
23 RX Ds xX Dr 2 3030 6.2* 34 
24 RXDsXS 22 .0836 1.69 34 
25 RXDrXS it 3838 7.8 34 
at ote eer ie 22 .0789 1.60 33 
27 TXRXDsXDr 242 -1423 2.89* 33 
28 TXR X Ds XS- 242 .0796 1.61 33 
29 TXRXDrXS 121 «1619 3.29* 33 
30 TXDsXDrXS 22 0622 1.26 33 
3 S 
ANR Ds XDK 242 .0152 — 33 
32 TXRXDsXDrXS 
33 Gas 13,824 .0491 — — 
Within EA 350.4416 
Spe 14,352 .0492 
34 Pooled nonsig. 27-33 14651 .0495 
35 Pooled nonsig. 17-33 1 4665 .0498 
36 Pooled nonsig. 7-33 $ 


* .001 probability.


Fig. 1. The relation of final error to allotted time, control ratio, and direction and magnitude of cursor
travel. [Panels: 1 in., 2 in., and 4 in. ratio. Curves: short right, short left, long right, long left.
Ordinate: error (inches); abscissa: time allotted (seconds).]

of the total variance. In spite of this there 
are 18 sources which prove to be significant 
at the .001 level. This is due in large meas- 
ure to the power of the error estimate, the 
within estimate containing 13,824 df. 

Figure 1 shows the mean error for all 12 
Ss for all 144 experimental conditions. Each 
data point is based upon 108 observations. 
Panels of Fig. 1, from left to right, show the 
results for the 1 in., 2 in., and 4 in. control 
ratios, respectively. Within each panel are 
four curves representing long and short travel 
distance and right and left initial cursor dis- 
placement. Although, in Table 1, Direction 
was not found to be a significant source of 
variance, when pairs of right versus left dis-
placements are tested by the Sign Test (2,
pp. 547-549) the left means differed from
the right at the .01 level.


Fig. 2. Error as a function of allotted time, control
ratio, and cursor travel distance. [Curves: 1 in., 2 in.,
and 4 in. ratio. Ordinate: error; abscissa: time
allotted (seconds).]
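
The Sign Test comparison of paired right and left means mentioned above can be sketched as a two-sided binomial test on the signs of the differences. The function name and the example pairs are hypothetical; the article does not report the individual pairs, and dropping ties is the usual convention rather than something the text specifies.

```python
from math import comb


def sign_test(pairs):
    """Two-sided sign test for paired observations (a, b).

    Ties are dropped. Under the null hypothesis each remaining pair
    is equally likely to favor either member.
    """
    plus = sum(1 for a, b in pairs if a > b)
    minus = sum(1 for a, b in pairs if a < b)
    n = plus + minus
    if n == 0:
        return 1.0
    k = min(plus, minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical pairs of (left mean error, right mean error) in inches.
example = [(0.031, 0.027), (0.044, 0.040), (0.052, 0.055), (0.038, 0.030),
           (0.061, 0.049), (0.029, 0.025), (0.047, 0.041), (0.035, 0.033)]
print(sign_test(example))
```

The test uses only the direction of each difference, so a small but consistent right-left difference can reach significance even when the corresponding F ratio in an analysis of variance does not.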

In order to demonstrate more clearly the
effect of ratio, travel distance, and time al-
lotted per setting upon error, right and left
directions were combined. These means, each
based upon 216 observations, are presented
in Fig. 2.

Time measures. The distribution of time
measures was markedly skewed, especially at
the short allotted times. This skewness is re-
lated to the time pressure on the Ss; often,
when little time was allowed, Ss would
fail to complete their setting. Because of the
response distribution, medians rather than
means were computed.

Figure 3 is similar to Fig. 2 except that
time instead of error is plotted on the ordi-
nate. Both travel and adjustment time are
shown as a function of allotted time, ratio,
and travel distance. Total time to make a
setting may be obtained by adding travel and
adjustment times for a given condition. Since
a trial could be terminated by either the E
(when the allotted time had expired) or the S
(when he was satisfied with his setting), it is
impossible to tell from the data who termi-
nated the trial if total time equals the al-
lotted time. Thus, if an S’s total time was
within 0.05 sec. of the allotted time, ad-
justment time data for that condition were
discarded; similarly, if travel time was within
0.05 sec. of allotted time, it too was dis-
carded.
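
The screening rule for the time scores can be sketched as below. The function names and example values are hypothetical, and treating "within 0.05 sec." as inclusive of the boundary is an assumption.

```python
from statistics import median


def usable_times(allotted, travel, adjustment, margin=0.05):
    """Return (travel, adjustment), with None in place of a discarded score.

    Adjustment time is discarded when total time falls within `margin`
    of the allotted time; travel time is discarded when travel time
    itself falls within `margin` of the allotted time.
    """
    total = travel + adjustment
    keep_travel = abs(allotted - travel) > margin
    keep_adjustment = abs(allotted - total) > margin
    return (travel if keep_travel else None,
            adjustment if keep_adjustment else None)


def median_of_retained(scores):
    """Median of the scores that survived screening, or None if none did."""
    kept = [s for s in scores if s is not None]
    return median(kept) if kept else None


# Hypothetical trials at a 1.0 sec. allotted time: (travel, adjustment).
trials = [(0.50, 0.30), (0.60, 0.38), (0.97, 0.00)]
screened = [usable_times(1.0, t, a) for t, a in trials]
print(screened)
print(median_of_retained([t for t, _ in screened]))
```

Medians are used on the retained scores because, as the text notes, the time distributions were markedly skewed at the short allotted times.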

Nominally in Fig. 3 each data point rep-
resents the median of 12 Ss, each of whom
made 18 observations. Because of the cri-
teria adopted regarding acceptability of the
time scores, not all Ss are represented in each
data point. If fewer than 6 Ss met the
criteria, no point was plotted; hence, in
Fig. 3, it will be seen that points are not pre-
sented for all allotted times, especially for ad-
justment time which would be the first to
suffer if the trial were terminated by the E.

The sum of travel and adjustment time
(total time) decreases markedly as allotted
time is shortened. In addition, especially for
long travel, the coarser ratios yielded shorter
total times. If the discarded data had been
included, the plotted data points would not
have shifted materially and the travel time
curves would approach the y = x line asymp-
totically and pass through the origin. The
adjustment time curves would approach the
y = x − travel time line and also pass through
the origin.


Fig. 3. The relation of travel and adjustment time
to allotted time, control ratio, and cursor travel dis-
tance. [Panels: long travel, short travel. Ordinate:
time (seconds); abscissa: time allotted (seconds).]


Fig. 4. The relation of travel and adjustment time
to allotted time, control ratio, and cursor travel dis-
tance. [Ordinate: time (seconds); abscissa: control
ratio (inches/rev.).]


Figure 4 shows median travel and adjust- 
ment time for long and short travel, right and 
left directions, with the 1 in., 2 in., and 4 in. 
control ratios. The plotted values include all 
time intervals and Ss; and thus, each point 
represents 1296 observations. Although dif- 
ferences were small, travel time was less, 
p< .05 (2, pp. 547-549), when the initial 
cursor displacement was to the right of the
target.


Discussion


Certain aspects of the present study may 
be compared to those of Jenkins and Connor 
(3). These authors found that increasing 
control ratio from 1 in. to 4 in. per revolu- 
tion did not decrease travel time appreciably. 
In the present study this is true only for the 
short travel distance. These authors also 
found that with an increase in control ratio, 
adjustment time increased. This is partially 
substantiated in the present study, but may 
be due to other factors. It may be simply 
that because the higher ratios give faster 
travel and thus allow more time for adjust-
ment within the allotted time, the S simply
uses all the time available to him to make the
adjustment. The results in the present study
may be related to the interaction between
time pressure and ratio rather than ratio
per se.

This same time pressure-control ratio in- 
teraction hypothesis may serve to explain one 
of the major findings of the present study. 
When ample time is allowed, the accuracy of 
the setting is fairly independent of the con- 
trol ratio, but as the allotted time is reduced, 
especially with long travel distances, use of 
the coarse ratio clearly results in more ac- 
curate performance. That is, the coarser 
ratios yield superior accuracy under time 
pressure possibly because they effectively al- 
low more time for adjustment. 


Summary 


Performance on a linear scale as a function 
of four independent variables was investi- 
gated: (a) reduced time intervals in which 
to make a setting, (b) control ratio, (c) di- 
rection of initial cursor displacement from 
target, (d) distance of cursor travel. Twelve 
Ss participated and each was instructed to 
make settings as fast and accurately as pos- 
sible. The size of the final discrepancy be- 
tween target and cursor (error) was measured 
as was the time for travel to the approximate
location of the target and the time for final
adjustment.


When ample time is allowed to make a 
setting, use of a relatively fine control ratio 
gives maximum accuracy; with limited time, 
a coarser control ratio gives maximum ac- 
curacy. 

The critical allotted time interval at which 
error magnitude increases rapidly is depend- 
ent upon both travel distance and control 
ratio. Reduced time to complete a setting 
may be partially compensated for by coarser 
control ratios or a reduced travel distance. 

Time taken by the S to complete a setting 
decreases with shorter allotted times and
coarser control ratios, as well as with short
travel distances.


Received September 3, 1957. 


References 


1. Edwards, A. L. Experimental design in psycho- 
logical research. New York: Rinehart, 1950. 

2. Festinger, L., & Katz, D. Research methods in 
the behavioral sciences. New York: The 
Dryden Press, 1953. 

3. Jenkins, W. L., & Connor, M. B. Some design 
factors in making settings on a linear scale. 
J. appl. Psychol., 1949, 33, 395-409. 

4. Jenkins, W. L., Mass, L. O., & Olson, M. W. In- 
fluence of inertia in making settings on a linear 
scale. J. appl. Psychol., 1951, 35, 208-213. 

5. Jenkins, W. L., Mass, L. O., & Rigler, D. Influ- 
ence of friction in making settings on a linear 
scale. J. appl. Psychol., 1950, 34, 434-439.

6. Lindquist, E. F. Design and analysis of experi-
ments in psychology and education. Cam-
bridge, Mass.: Houghton-Mifflin, 1951.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


The Influence on the Results of a Conventional Personality 
Inventory by Changes in the Test Situation: A Study 
on the Humm-Wadsworth Temperament Scale 


Gudmund Smith and Sven Marke 


University of Lund, Sweden 


The Humm-Wadsworth Temperament Scale
(2, 4, 5, 7) consists of 318 items altogether.
The choice of “Yes” or “No” for 164 of these
items is taken to indicate the existence of
seven personality components: normal (N),
hysteroid (H), manic (M), depressive (D),
autistic (A), paranoid (P), and epileptoid
(E). Weights varying from 1 to 6 are al-
lotted to indicative choices according to their
diagnostic power (7). An item often belongs
to more than one scale; several components
may even share the same response alternative.
Thus the choice of “Yes” for “Do you some-
times feel cross or grouchy without special
reason?” adds 3 to the strength of the D com-
ponent and 1 to the A component. While,
however, N is determined mainly by negative
choices and E by a more equal number of
negative and affirmative ones, maximum
strength for the other components is ac-
quired more or less exclusively by the choice
of “Yes.” Thus, inevitably, the number of
negative alternatives chosen in the H-W test
will correlate positively with the strength of
N and negatively with the strength of H, M,
D, A, and P. In order to make test results
comparable for people with differing num-
bers of No-responses Humm has, therefore,
introduced a correction for No-count plus
other corrections diminishing the correlation
(3, 5, 6).¹

¹ We refer to all these corrections when talking
about No-count corrections below.

The median for No-count in Humm’s sam-
ple is around 167 (5). In a Swedish sample
of 78 job applicants tested at 4 factories (by
certified Humm-testers with the authorized
Swedish version of the test) the median
was between 195 and 200, i.e., [...]
the range which is considered acceptable
by Humm and his associates (the standard
deviation being approximately [...]).
Since Humm’s samples consisted of [...]
who did not depend on the test results for
their employment as did our sample, this dif- 
ference in test situation would apparently ex- 
plain the difference in No-count. Defenders 
of the Humm test, aware of the serious im- 
plications of such an explanation, have tried 
to rejoin that a great many applicants are 
more maladjusted than people holding steady 
jobs and are thus less open to questioning and 
more apt to hide their faults and to choose 
“No” for an answer. But since we found no 
conclusive differences in our sample between 
those who became employed after testing and 
those who were discarded, we are hardly 
willing to subscribe to such a “charactero- 
logical” explanation. We do not believe 
either, as Humm obviously does, that the 
preference for normal indicators at the ex- 
pense of pathological ones which goes with a 
high No-count is only an epiphenomenon, the 
consequence of an S’s general, negativistic 
bias for the very word “No.” It seems much 
more sensible to reverse the argument and as- 
sume that people applying for a job want to 
appear as normal and desirable as possible 
and that, owing to such an attitude and the 
tendency of No-responses in this test to be 
socially more acceptable, their No-counts au- 
tomatically increase over those for people who 
are already holding a job and have nothing 
serious at stake. If “Yes” was generally the 
more acceptable alternative it would natu- 
rally be chosen by those who now prefer 
“No.” 

The main purpose of this paper is, there- 
fore, to test the hypothesis that the results of 
the H-W inventory are sensitive to the situa- 
tion of the S (cf. also 1). The confirmation 
of such a hypothesis will naturally affect the 
use also of other personality questionnaires, 
most directly those screening tests which have 
been standardized in a situation widely dif-
ferent from the situation for which they are
recommended. More specifically, we also
want to inquire into the concept of “response- 
bias” as described above and to control em- 
pirically if corrections based on such a ra- 
tionale can be effective. 


The First Experiment 
Subjects and Method 


This experiment should be considered preliminary 
and is in part a repetition of one performed by Giese 
and Christy and reported by Tiffin (8, pp. 170 f.). 
Twenty-six students were selected at random from 
two senior classes in a teachers’ college; 12 of them 
were men (aged 24.7 years) and 14 women (aged 
22.9 years). Two men had to leave before the sec- 
ond testing and were therefore excluded from the 
group. We chose these people as Ss because their 
level of education was about the same as for the 
large sample referred to in the introduction. More- 
over, all students were well above the age limit set 
by Humm for the use of his test. 

The E gave the usual instructions for the H-W 
questionnaire but added the following sentences: 
“We only want to test the inventory—not you or 
your personality traits. This is a purely scientific 
experiment, and all results will be treated confi- 
dentially. The college has nothing to do with this 
testing. Teachers and other people on the staff will 
have no access to your results.” We call this situa- 
tion A. When an S had completed his test the fol- 
lowing instructions were placed before him to read: 
“We are sorry that we have to ask you to fill in the 
questionnaire once again. But now you should try 
to imagine that this is an attempt to examine your 
teaching ability, i.e., your test results will be used as 
a measure of your suitability as a future teacher.” 
This was the B situation. 


Differences Caused by the Change in Situation


Table 1 summarizes the differences in No-
count between the two situations. Since
women had fewer No-responses than men, we
treated the sexes separately. It is evident
that for both men and women the B situa-
tion, even if its stress was not real, caused a
significant increase in No-count. These re-
sults may be taken as a preliminary confirma-
tion of our hypothesis that the H-W test de-
pends on the test situation.


Table 1

No-Count Differences in the First Experiment

                                               Comparison
Sex         Situation             No-Count     t       P
Men         “Confidential” (A)    175.7        2.70    .05-.02
(n = 10)    “Applicant” (B)       198.9
Women       “Confidential” (A)    155.6        2.94    .02-.01
(n = 14)    “Applicant” (B)       171.7

Two sets of results are presented in Table 2: (a) raw scores derived directly from an S’s response pattern, (b) profile values corrected for No-count, etc., and transformed into a 21-point scale. While raw scores for N increase from A to B and those for other components decrease, this trend is reversed in N, H, and P as far as profile values are concerned. Such a tendency for No-count corrections to affect some component values more than others is also reflected in the integration indices (Table 3), i.e., the sum of the differences between profile values in N and each of the other components. Although our Ss chose 19 additional “No’s” in B they got an integration index of 30.0 as against 35.3 in A. Humm’s corrections for an increase in No-count do not seem to restore profile values in B to the same level as in A.
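The integration index described above can be illustrated with a short sketch. It takes the definition literally, as the sum of the differences between the N profile value and each of the other six profile values; the sign convention (N minus component), the function name `integration_index`, and the example profile are assumptions made for illustration and are not data from the study.

```python
def integration_index(profile):
    """Sum of (N - component) over the six non-normal components.

    `profile` maps component letters to profile values on the 21-point scale.
    The sign convention is an assumption; the text only says "differences".
    """
    others = ["H", "M", "D", "A", "P", "E"]
    return sum(profile["N"] - profile[c] for c in others)


# Hypothetical profile, for illustration only (not taken from the study).
example = {"N": 15, "H": 10, "M": 11, "D": 9, "A": 8, "P": 10, "E": 9}
print(integration_index(example))
```

Under this reading, a profile in which N lies well above the other components yields a high index, and a flatter profile yields a lower one.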


Table 2
Means of Raw Scores and Profile Values in the First Experiment

(Rows: raw scores and profile values in the “Confidential” (A) and “Applicant” (B) situations; columns: components N, H, M, D, A, P, E. The cell values are illegible in the source.)


Table 3
No-Counts and Integration Indices in the First Experiment

Situation             No-Count    Integration Index
“Confidential” (A)    164.0       35.3
“Applicant” (B)       183.0       30.0
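As a consistency check, the overall No-counts in Table 3 can be compared with the sex-specific means in Table 1 of this experiment. The sketch below assumes that the Table 3 figures are simply the n-weighted means of the men's and women's values; the helper name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """n-weighted mean of subgroup means; `groups` is a list of (n, mean)."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# Subgroup sizes and mean No-counts as printed in Table 1 (men n=10, women n=14).
situation_a = pooled_mean([(10, 175.7), (14, 155.6)])
situation_b = pooled_mean([(10, 198.9), (14, 171.7)])
print(round(situation_a, 1), round(situation_b, 1))
```

Rounded to one decimal, the two pooled values agree with the 164.0 and 183.0 printed in Table 3, and their difference matches the 19 additional “No’s” mentioned in the text.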


There are further reasons to question the rationale upon which No-count corrections are based. Correlations between raw scores in A and B are all positive and all except one significant (Table 4). Since the increase in No-count varies considerably among our Ss the coefficients of correlation for raw scores are of an expected magnitude. Profile values in B, on the other hand, have been corrected for a change in “response-bias”; and correlations between profile values should thus increase as compared with raw score correlations. But there is no such trend in our results.

In comparison with the correlations found by Giese and Christy (8, pp. 170 f.) our correlations are rather high, perhaps because our second testing was performed immediately after the first. Moreover, our sample may have been less willing to submit to imagination. In spite of the positive correlation coefficients, however, our results reflect rather marked changes from A to B. If the instructions had remained the same in B, our coefficients could have been regarded as an estimation of the retest reliability. We know that response patterns often differ from one testing to another because the attitude of the S to himself and to the test changes. But even a retest coefficient should exceed +.50 considerably. There is no doubt that the reversal of instructions has accentuated the difference between test profiles in A and B.


The Second Experiment 


Subjects and Method 


This experiment represents an improvement over 
the preliminary one in that the B situation was made 
much more real. Two groups of Ss were selected 
at random from the three senior classes in another 
teachers’ college. There were 33 Ss in Group I (17 
men and 16 women) and 35 Ss in Group II (18 
men and 17 women). The average age for all of the 
four subgroups was very close to 24 years. In order further to control the sampling we compared the average term marks (on a scale from 1 to 5) in teaching, Swedish, mathematics, and athletics for the two groups. These means proved to be almost identical. The two main groups were tested at the same time in different classrooms. After reading the usual H-W instructions, E added the following remarks:

Group I: See instructions for situation A above. 

Group II: “We don’t want you to remain ignorant 
of the fact that this testing may be important for 
you. It is an attempt to examine your teaching 
ability experimentally, i.e., your test results will be 
used as a measure of your suitability as future 
teachers.” 

Reactions to the introduction of the questionnaire 
differed considerably between the groups. In Group 
I, Ss worked in silence and with concentration and 
finished their task quickly. Group II received the 
instructions with dissatisfied murmurs, exchanged 
meaningful glances and were only too eager to criti- 
cize the questions. Some of the Ss even wanted to 
leave in protest but were prevented by the E. They 
finished the test about half an hour later than Group 
I, obviously very tired and anxious.


Differences in No-count and Component Values

Table 5 includes a series of comparisons be- 
tween the two groups. The number of No- 


Table 4
Correlations Between Situations A and B

(Rows: correlations and P values for raw scores and for profile values; columns: components N, H, M, D, A, P, E. The cell values are not reliably legible in the source.)

Note. Product-moment correlation.


Table 5
Differences in No-Count, Raw Scores, and Profile Values Between Groups I and II in the Second Experiment

(Means for Group I, “Confidential,” and Group II, “Applicant,” with t and P, given separately for men and women. The table was printed sideways and its cell values are illegible in the source.)


responses is significantly higher in Group II, for men as well as for women. In view of the fact that the group of applicants referred to in the introduction was predominantly masculine it is of special interest to note that the average No-count of men increases by 33 choices. The No-count in Group I is somewhat higher than in Humm’s samples. But the marked difference between the two groups makes our introductory assumption plausible that the difference in No-count between our sample of applicants and Humm’s employed samples was to a high degree due to differences in the general test situation. We also observe that the average No-count among the men in Group II is about the same as for male Swedish applicants. These results imply a further substantiation of our basic criticism against the H-W test: that it is not warranted to apply norms standardized for employees to the results of job applicants.

The change in No-count is reflected in the raw scores. While N increases and E remains relatively unaffected, raw scores for the other components decrease. There are more significant differences for men than for women because the change in No-count was less for the latter category. The groups differ most markedly with respect to D and A. Even if N ought to be affected by a changed “response bias” about as much as A, the difference between I and II is hardly significant; this result is in agreement with the high intersituational correlation reported for N in the previous experiment. Differences in profile values also show about the same trends as before: while N and A (in the male subgroup) thus decrease significantly, the average score on P increases. Humm’s corrections apparently do not compensate for the changes in raw scores which go with such a substantial enhancement in No-count as reported here; and we turn to an analysis of individual items in order to illustrate more clearly how the excessive No-responses are distributed over the components.


Differences in Responses to Individual Items


In this analysis we did not keep men and women apart but compared only the two main groups. All P values reported for differences between choice distributions in the groups were based on the Mosteller-Tukey graphic χ² calculations. Before knowing anything about the results of the analysis, the present authors tried to guess which items should be most affected by the difference in test situation introduced here, i.e., we scrutinized the content and formulation of each question, especially those with an obvious moralistic bias, for how much a “Yes” or “No” was likely to clash with the ideal image of a teacher.

Figure 1 shows the percentage of indicative 
alternatives chosen by the groups. The main 
differences between I and II are in line with 
those reported for raw scores above. But al- 
though the distribution of indicative choices 
is rather similar in H, M, D, A, and P (“Yes” 
most often indicates the existence of the typi- 
cal “disposition”) the change in their relative 
numbers from I to II is not the same. The 
decrease is most evident for A, the raw scores 
and profile values of which diminish accord- 
ingly. P, with the same percentage of indica- 
tive choices in Group I, is much less affected; 
and the obvious result is, in spite of diminish- 
ing raw scores, that profile values increase in 
Group II. This is the effect of No-count cor- 
rections when the change in No-count is re- 
lated to more change in A and less in P than 
implied in Humm’s equations. The slight in- 
crease in N also explains why profile values 
are lower in Group II. 

Fig. 1. Percentages of indicative alternatives chosen by Groups I and II.

As many as 59 choice distributions for individual items changed so much that the difference between the two groups became significant beyond the .05 level. The present authors picked out 74 items which they believed would be especially sensitive to the difference in test situation. If the 59 significant items were randomly distributed over the entire scale, 13.7 would fall among the 74 items just mentioned. But our guessing, even for the direction of change, was correct in 29 cases. The probability of arriving at this number of correct guesses was less than .01 (χ² = 6.73). This result tends to support our hypothesis that the “extra” No-responses in Group II concern such items for which a “Yes” would be socially inopportune.
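The chance expectation of 13.7 quoted above follows from simple proportionality, given the 318 items of the whole scale. The sketch below reproduces only that expected count; it does not attempt to reproduce the reported χ², whose exact computation (a Mosteller-Tukey graphic method) is not spelled out in the text. The function name `expected_overlap` is hypothetical.

```python
def expected_overlap(n_significant, n_guessed, n_items):
    """Expected number of significant items falling among the guessed items
    if the significant items were spread at random over the whole scale."""
    return n_significant * n_guessed / n_items


expected = expected_overlap(n_significant=59, n_guessed=74, n_items=318)
print(round(expected, 1))
```

The 29 correct guesses reported in the text are thus roughly twice the number expected by chance.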

If the difference in instructions had been of 
no importance we might have expected a sig- 
nificant change (P <.05) in 15.9 items out 
of 318, or, in 8.2 of the 164 indicative items. 
Consequently, there can be no doubt that the 
test situation influenced response patterns. 
But the 59 (36 for indicative items) signifi- 
cant differences between the groups do not 
imply only that Group II tried to avoid af- 
firmative answers and thus proved to be sensi- 
tive to situational stress. If the additional 
No-responses had been randomly distributed 
over all items we would have expected about 
64 (33) significant differences, but only 13 
(7) beyond the P level of .01 and 1.3 (.7) 
beyond .001. Since we found 28 (16) dif- 
ferences to be significant beyond .01, 9 (4) 
of them even beyond .001, our results indicate 
instead, that Group II concentrated on a 
limited number of items. 
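The chance expectations in this paragraph can be checked with a few lines of arithmetic. The first function reproduces the 15.9 and 8.2 items expected at the .05 level. The second rests on an assumption not stated in the text, namely that the expected counts at the .01 and .001 levels were obtained by scaling the count at the .05 level in proportion to the significance level; both function names are hypothetical.

```python
def expected_by_chance(n_items, alpha):
    """Number of items expected to reach significance at level alpha by chance."""
    return n_items * alpha


def scale_from_05(count_at_05, alpha):
    """Rescale a count at the .05 level to a stricter level alpha.

    Assumes counts shrink in proportion to alpha when differences are
    spread at random over the items (an interpretation, not the authors' formula).
    """
    return count_at_05 * alpha / 0.05


print(round(expected_by_chance(318, 0.05), 1))  # all items
print(round(expected_by_chance(164, 0.05), 1))  # indicative items only
for count in (64, 33):
    print(count, round(scale_from_05(count, 0.01), 1), round(scale_from_05(count, 0.001), 2))
```

Under this reading the scaled values round to the 13 (7) and 1.3 (.7) given in the text, against which the observed 28 (16) and 9 (4) are clearly larger.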


Summary and Conclusions 


This paper was intended to examine the 
sensitivity of the Humm-Wadsworth Tem- 
perament Scale to the test situation. One 
group of 24 Ss was tried, first in a “clinical” 
situation and then in an “applicant” situa- 
tion. We also compared two groups of 33 
and 35 Ss, respectively, randomly selected 
from a larger group, each of which was tested 
in one of the above situations. The results of 
the two experiments were essentially similar. 
The number of No-responses increased sig- 
nificantly in the “applicant” situation, espe- 
cially when it was made more real in the 
second experiment, and reached about the 
same level (near 200) as in a Swedish group 


Gudmund Smith and Sven Marke 


of 978 job applicants. The median number 
reported by Humm for his samples was 167. 
The increase in No-count implied an increase 
in raw scores for the normal component and 
a decrease for the “pathological” components. 
Even profile values in the stress situation, 
which were corrected for the additional No- 
responses, differed from those in the control 
situation. An analysis of responses to indi- 
vidual items revealed that this inability of 
Humm’s corrections to restore profile values 
to the “control” level was due to the fact that 
an increase in the number of No-responses implied a change in response patterns vis-à-vis
selected components and items.

Humm and his associates have standardized 
their instrument in samples of people who 
were already employed and had thus nothing 
serious at stake when tested. But they rec- 
ommend the questionnaire for use in situa- 
tions where results have proved to be signifi- 
cantly different. Corrections for No-count, 
established in a nonapplicant sample, are sup- 
posed to compensate for these differences. 
But Humm’s corrections, necessarily, were 
built upon the assumption that when the 
number of No-responses exceeded the norm 
they would be proportionally dispersed over 
the seven components. If this assumption 
had been correct, a frequent choice of “No” 
would result only in a narrowing of the dis- 
tribution of profile values as demonstrated for 
a group of 508 Swedish job applicants. But 
the results presented here suggest, in addi- 
tion, that the cause of a changed response 
pattern in a stress situation was not an in- 
tensification of a general, negativistic bias for 
the word “No” but rather an increased preference for socially acceptable answers to a number of sensitive questions. Our inevitable conclusion must therefore be that test profiles from the H-W scale, even if corrected for No-count, often include too many uncertainties to be accepted as indicative of a person’s temperament.³

This conclusion may very well apply also to other questionnaires where Ss are asked to describe their own behavior. The widespread use of these instruments can often be justified because they are simple and direct and require a minimum of experimental preparation and theoretical ramifications. After this analysis of one of the more widely used inventories the present authors tend to be biased in favor of either more projective techniques of questioning or of more rigorous experimental procedures for personality (and personnel) testing.

² There is an approximately linear relationship in Humm’s corrective nomogram between high No-count, high raw scores for N and low for H-E on the one hand and profile values on the other.

³ In order to simplify matters, we have not presented the mathematical formulas upon which Humm’s corrections were based, only the rationale for the very attempt at such corrections. Since the rationale has proved to be rather dubious, any further discussion of the formulas would be superfluous.

Received September 10, 1957.


References


1. Dorcus, R. M. A brief study of the Humm-Wadsworth Temperament Scale and the Guilford-Martin Personnel Inventory in an industrial situation. J. appl. Psychol., 1944, 28, 302-307.

2. Humm, D. G. Personality and adjustment. J. Psychol., 1942, 13, 109-134.

3. Humm, D. G., & Humm, Kathryn A. Validity of the Humm-Wadsworth Temperament Scale: [...] response-bias. J. Psychol., 1944, 18, 55-64.

4. Humm, D. G., & Humm, Kathryn A. Measures [...] Temperament Scale. [...] 107, 442-449.

6. Humm, D. G., Storment, R. C., & Iorns, M. E. Combination scores for the Humm-Wadsworth Temperament Scale. J. Psychol., 1939, 7, 227-253.

7. Humm, D. G., & Wadsworth, G. W. The Humm-Wadsworth Temperament Scale. Amer. J. Psychiat., 1935, 92, 163-200.

8. Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1942.


Journal of Applied Psychology, 1958


The Internal Consistency of the Humm-Wadsworth Temperament Scale


Gudmund Smith and Sven Marke

University of Lund, Sweden


The Humm-Wadsworth inventory (3, 4)
has not been subjected to any analysis of the 
homogeneity of its scales. The main reason 
for this failure to control the internal consist- 
ency of a widely used screening test seems to 
be that methods of scale analysis, at least in 
personality test construction, have rarely been 
applied until the last decade. Traditionally, 
only problems of reliability and validity were 
regarded as crucial for personality question- 
naires. When test instruments are intended 
to measure very general forms of behavior, 
as, e.g., the ability to cooperate or to lead, 
an analysis of internal consistency, naturally, 
may be rather superfluous. Since such behavior patterns rarely reflect consistent personality variables, test instruments constructed for their evaluation must be made
up of items which are rather loosely connected 
with each other. Two individuals who reply 
differently to a number of questions included 
in such a test need not differ, for instance, 
with respect to their fitness as leaders. Here, 
lack of one-dimensionality does not exclude 
high validity. But problems of internal con- 
sistency ought to be important for those per- 
sonality inventories which claim to measure 


the strength of basic personality variables and 
their mutual interrelation. 


Considerations Regarding the Humm- 
Wadsworth Test 


The H-W test intends to measure seven 
components of temperament described on the 
basis of Rosanoff’s textbook in psychiatry (7). 
There are also a number of subcomponents, 31 altogether. But since these are to be regarded as modified manifestations of the basic tendencies, we will concentrate our analysis on the seven main components: normal (N),
hysteroid (H), manic (M), depressive (D), 
autistic (A), paranoid (P), and epileptoid 
(E). Considering the symptomatic character 
of this classification we should expect some 


correlation between items representing the 
same component. This does not mean that 
we demand complete homogeneity of a collec- 
tion of items constituting a component scale: 
the theory allows some variation in symptoms indicating a latent personality tendency and its intensity and, therefore, necessitates that scales include questions which differ from each other to some extent with regard to their type and aims. One might perhaps demand the same degree of internal consistency of the seven scales in the H-W test as social psychologists demand of their attitude scales. If the scales are homogeneous, one can conclude, with reasonable certainty, that they are also reliable, but the reverse does not hold.

In any case, the H-W scales should not differ considerably among themselves with respect to internal consistency. If that were the case, the claims of the H-W test to measure the primary components of personality structure must be invalid since, proceeding from Rosanoff’s theory or any other theory of the same kind, one can hardly presume that some components would be represented by more consistent symptomatic patterns than others. Big differences in internal consistency among the scales will obviously mean that some scales measure mutually more connected symptoms than others and that the scales, probably, have been based upon superficial, vague, and ambiguous definitions of the components, definitions which partly, in the case of the H-W test, depend on the choice of clinical groups for the validation (6).


The Statistical Basis for the Scale Analysis


Our scale analysis was based upon a method originally developed by Likert (5). In this study we used the following procedure. In a group of 508 job applicants at [...] companies, including all male Ss [...] certified Humm-testers with the [...] version of the 1954-55 revision of the test, we selected the upper and lower quartiles with respect to the profile values for each component, i.e., the raw scores corrected for No-count and transformed into a 21-point scale. Since an exact 25% selection was impossible with a discrete variation, we tried to get a total selection as close to 50% as possible (Table 1). The next step in our scale analysis implied that we computed the average value for each item in the two extreme groups. The choice alternative indicating the component was counted as +1 and the other alternative as 0. If we subtracted the average value for an item in the low group (M_L) from its average value in the high group (M_H), we got the value of DP (the discriminatory power), or, an estimation of how strongly an individual item correlated with the entire scale.
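The DP computation described above can be sketched as follows. The function name, the cut-off parameters, and the small data set are hypothetical; the extreme groups are formed with simple threshold cuts on the component's profile value, which only approximates the authors' selection of upper and lower quartiles.

```python
def discriminatory_power(item_scores, profile_values, low_cut, high_cut):
    """DP = M_H - M_L for one item.

    item_scores:    1 if an S chose the indicative alternative, else 0
    profile_values: each S's profile value on the component (same order)
    low_cut:        Ss with profile value <= low_cut form the low group
    high_cut:       Ss with profile value >= high_cut form the high group
    """
    high = [x for x, v in zip(item_scores, profile_values) if v >= high_cut]
    low = [x for x, v in zip(item_scores, profile_values) if v <= low_cut]
    if not high or not low:
        raise ValueError("both extreme groups must contain at least one S")
    m_high = sum(high) / len(high)  # average item value in the high group
    m_low = sum(low) / len(low)     # average item value in the low group
    return m_high - m_low


# Hypothetical data for eight Ss, for illustration only.
item_scores = [1, 1, 0, 1, 0, 0, 1, 0]
profile_values = [18, 17, 16, 12, 11, 6, 5, 4]
dp = discriminatory_power(item_scores, profile_values, low_cut=6, high_cut=16)
print(round(dp, 3))
```

Because the item is scored 0 or 1, each group mean is simply the proportion of that group choosing the indicative alternative, so DP ranges from -1 to +1.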

A DP value is naturally not the same as a conventional coefficient of correlation. But we know, on the other hand, that it can be regarded as a good estimation of that coefficient (9). Moreover, we have to point out some weaknesses in the Likert method. The one-dimensionality of a scale can be measured reliably only by a factor analysis. A DP value, as a matter of fact, implies nothing but an estimation of the first centroid factor. The regression is practically linear (r > .80 according to Ekdahl [2]; cf. also [6]). Consequently, a scale with high DP values may include items which are mutually uncorrelated. In view of the practical impossibility of factor analyzing this material, however, we decided to be content with a Likert analysis. This does not imply an unfair treatment of the H-W instrument in the sense that the yardstick used for our criticism is too heavy. Instead, the demands of the Likert method on the internal consistency of the scales are rather too small; and it will, therefore, exonerate the scales from a number of existing inconsistencies which it cannot lay bare.


Table 1
The Number of Ss in the High Group and Low Group for Each Component

                        Component
                 N      H      M      D      A      P      E
High group   %   29.6   27.8   32.5   19.2   22.0   17.9   23.6
             n   149    140    164    97     111    90     [?]
Low group    %   24.6   19.0   19.6   26.4   26.0   29.8   27.8
             n   124    96     99     133    131    150    140
Total        %   54.2   46.8   52.1   45.6   48.0   47.7   51.4


Only when one response alternative has 
been chosen much more often than the other, 
the DP values may erroneously become too 
low. For this Ekdahl has proposed a correc- 
tion. We arrived at the original DP values 
by means of the following equation: 


DP = M_H - M_L.


But we computed, in addition, the best dif- 
ference possible with the choice distributions 
given in the entire group. Let us assume that 
the high group represents X% of the total 
group, and that Y% of it chose the alterna- 
tive indicating the component. In case Y is 
bigger than X, the best possible difference 
will be + 1.0. But if Y is less than X this 
figure will diminish at the same time as the 
difference between Y and X increases. The 
reverse reasoning applies to the low group: if the number of individuals selected for this group is bigger than the number of nonindicative choices, the best possible value will exceed 0. We use m_H to stand for the best possible value in the high group and m_L for the corresponding value in the low group. The corrected DP value was then computed in the following way:


DP_c = (M_H - M_L) / (m_H - m_L).


If the number of Ss included in the extreme groups is less than the number of choices of both alternatives, the denominator will naturally be +1.0. But in case distributions are so skew that the number of high group or low group Ss exceeds the number of choices of either the indicating alternative or the nonindicating one, the denominator will be less than +1 (because m_H diminishes or m_L increases). And DP_c will increase in relation to DP. Thus a DP value for an item where choice distributions are skew can be improved by our correction. We have to underline, however, that even DP_c values should not be generalized to stand for a sample with a lower No-count than this one (the number of No-responses was close to 200). It is also important to remember that DP_c values for items where very few Ss have chosen one of the response alternatives are naturally very unreliable.


Fig. 1. DP_c values for the seven components in the Humm-Wadsworth Temperament Scale.
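The correction for skew choice distributions can be sketched in code. The sketch assumes that the corrected value is the original DP divided by the best possible difference, m_H - m_L, which is how the description of the denominator reads. It further assumes that m_H is the largest proportion of indicative choices the high group could contain and m_L the smallest proportion the low group could contain. The function name and the counts are hypothetical.

```python
def corrected_dp(dp, n_high, n_low, n_indicative, n_total):
    """Corrected DP: the observed DP divided by the best possible difference.

    dp:           observed M_H - M_L for the item
    n_high/n_low: number of Ss in the high and low extreme groups
    n_indicative: number of Ss in the whole sample choosing the indicative alternative
    n_total:      size of the whole sample
    """
    n_nonindicative = n_total - n_indicative
    # Best case for the high group: filled with indicative choosers as far as they go.
    m_high = min(1.0, n_indicative / n_high)
    # Best case for the low group: filled with nonindicative choosers as far as they go.
    m_low = max(0.0, (n_low - n_nonindicative) / n_low)
    denominator = m_high - m_low
    if denominator <= 0:
        raise ValueError("best possible difference must be positive")
    return dp / denominator


# Hypothetical item: only 60 of 500 Ss chose the indicative alternative.
skew = corrected_dp(0.30, n_high=150, n_low=125, n_indicative=60, n_total=500)
# Hypothetical item with an even split: the correction leaves DP unchanged.
even = corrected_dp(0.30, n_high=150, n_low=125, n_indicative=250, n_total=500)
print(round(skew, 2), round(even, 2))
```

With the even split the denominator is 1.0 and the value is unchanged; with the skew item the high group can contain at most 60 indicative choosers out of 150, so the denominator shrinks and the corrected value rises.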


Results of the Scale Analysis 


Figure 1 summarizes the first results for individual items in the component scales. There are two broken lines in the diagram, one for DP_c values of .20 and one for values of .40. The first line is adapted to what Likert (5) considered to be the lower limit for DP values in a reasonably homogeneous scale (.75 for a five-point scale) and the second line to the more rigorous demands made by Ekdahl after his analysis of related problems (op. cit.). It is only too evident that the scales differ from each other with respect to internal consistency. While M and A may be considered fairly acceptable if Likert’s more lenient norms are applied, N appears to be very heterogeneous.
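The two reference lines at .20 and .40 amount to sorting the items of a scale into three bands. A minimal sketch follows, with hypothetical item numbers and values; how values falling exactly on a line are treated is an assumption, since the text does not say.

```python
def classify_items(dp_values, lenient=0.20, strict=0.40):
    """Sort items into bands relative to the .20 and .40 reference lines.

    dp_values maps item number -> corrected DP value.
    Values exactly on a line are assigned to the middle band (an assumption).
    """
    bands = {"below_.20": [], ".20_to_.40": [], "above_.40": []}
    for item, dp in sorted(dp_values.items()):
        if dp < lenient:
            bands["below_.20"].append(item)
        elif dp <= strict:
            bands[".20_to_.40"].append(item)
        else:
            bands["above_.40"].append(item)
    return bands


# Hypothetical item numbers and values, for illustration only.
hypothetical = {101: 0.12, 102: 0.27, 103: 0.45, 104: 0.38, 105: 0.05}
bands = classify_items(hypothetical)
print(bands)
```

The share of a scale's items in each band then gives a rough picture of its homogeneity under the lenient and the strict norm.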




We also tested statistically the differences between the scales. A includes very high DP_c values as well as very low ones. The left part of Fig. 2 indicates that this scale differs from all the other scales with respect to the deviation of these values. As far as the average DP_c values for the scales are concerned, N has a lower value than the “pathological” components. On the right part of Fig. 2, we can read that M differs from E, P, H, N; A and D from H and N; and E, P, H from N. These obvious differences, as we have said already, imply a serious warning for the component theory, which presupposes that the scales represent logically and psychologically comparable components. We know that the final inclusion of items in the seven preconceived scales was motivated exclusively by the results of a validation study (3, 4). The homogeneity of a scale thus depended on how the diagnosticians conceived and applied the component descriptions. Since the Rosanoff-Humm descriptions correspond in part to relatively unitary psychiatric syndromes (autistic, manic, depressive) and in part to mixtures of several principles of classification (paranoid, epileptoid, hysteroid, normal) we were not surprised to learn that the internal consistency of the scales varied considerably.


The Normal Component 


N is the most inconsistent and theoretically least acceptable of the scales. There are no reliable DP_c values above .40, and above .20 only about 25% of them. Among items (Table 2) with high DP_c values one may trace some correspondence in content. Many indicative choice alternatives seem to concern a good measure of self-reliance (255, 256, 287), an emotional balance going with it (287, 109), and lack especially of paranoid defense mechanisms (279, 14, 255, 109). But several of the worst items aim at the same vague behavior complex (80, 155, 199, 302). One reason for the lack of homogeneity in the N scale seems to be that many questions are formulated in such an ambiguous manner that S is forced to choose his answer more or less at random; another reason is a rather transparent moralistic bias in some of the items which is apt to make an S’s choice inconsistent with responses to more neutral questions; a third reason may be the mixture of normal tendencies and more or less compulsive ones (32, 280, 94, 184, etc.).


Fig. 2. Differences between the component scales in the deviation of DP_c values (variance ratios) and in their means (t values). xxx P < .001; xx P < .01; x P < .05.


Table 2
A Summary of the Results of the DP Analysis with Reference to Individual Items

(Columns: Normal, Hysteroid, Manic, Depressive, Autistic, Paranoid, Epileptoid; rows: bands of DP_c values. The table was printed sideways and the item numbers in its cells are illegible in the source.)


The Hysteroid Component 


This scale also belongs to the inconsistent
ones. Only 4 DP_c values are higher than .40.
Most of them, however, exceed .20. The 
worst items in H, as a rule, have very little 
in common with hysteroid symptoms but 
rather with autistic (88) and paranoid ones 
(302, 42), with sthenic working habits (197), 
or, with general norms of ethics which may 
differ markedly even between seriously “ethi- 
cal” individuals (234, 226). The best items 
in the scale aim at many typically hysteroid 
traits: an inclination to exhibitionist behavior 
(317, 245) and gambling (75, 273), a lack of 
objectivity and permanent values (186, 84, 
104, 284) together with a basic indifference 
and cynicism vis à vis other people (195, 165, 
27, 258). Several of them are projective (27, 
258, 214, 94, etc.). But the choice of indica- 
tive alternatives to these questions need not 
imply a hysteroid character. A scrupulously 
honest person will certainly affirm questions 
84 and 297 (instances of bad conduct and 
nonattendance in school), and a naively rough 
person 317 and 245 (practical jokes and so- 
ciability). Since the self-image of hysteroid 
people tends to be characterized by consistent
falsification one can hardly expect that they,
among all people, will choose the indicative 
alternatives. 


The Manic Component 


There are only 3 DP_c values below .20 in this relatively acceptable scale. Some of the most inconsistent items belong to a subscale intended to measure the degree of euphoria. But this applies only to item 112 while other questions (267, 47, 156) refer to irritability and the rest of them (71, 119) to emotional adaptation and identification. On the whole, however, M consists of items aiming at cyclic emotionality and needs of contact. But since M and D share a great number of items we may ask why they have not been conjoined into one component. A preliminary analysis of the DP values in M as computed on the basis of extreme groups in D suggested that the two scales were highly interrelated, as are the manic and depressive symptoms in psychopathology.


The Depressive Component 


D also belongs to the acceptable scales. Among items with low DP_c values we find several without specific relations to depressive tendencies: shyness (3), impatience (304, 139), aggression toward authorities (151), refusal to play in a new game (23), or “Do certain animals make you nervous?” (35).


The Autistic Component 


Here we find DP_c values of all magnitudes, most of them high enough to be acceptable. Two of the least acceptable groups of items were summarized under such seemingly relevant headings as feelings of inferiority and narrow interests. But the choice of “yes” to 39 (more intense feelings than in other people) need not be a sign of inferiority feelings, and a choice of “yes” to 23 (see above) or to 151 (an inclination to oppose overbearing people) not a sign of autism. And items 24 and 169 (difficulties in relaxing and a tendency to get stuck in details) would probably be more relevant in a scale measuring compulsive or asthenic symptoms. Items at the top of the scale of DP_c values refer to more adequate symptoms such as shyness, timidity, contact difficulties, etc.


The Paranoid Component 


While most items fall above the .20 line,
only three of them exceed .40. Most of the
best items refer to paranoid characteristics as
we usually understand them: jealousness, sus-
piciousness, rigidity, contempt for the opinion
of others. Projective formulations are not
uncommon (24, 58, 148, 94, etc.). Toward
the bottom of the scale, however, the unre-
latedness of items becomes obvious. In what
way do scurrying work habits refer to para-
noic tendencies (48), or an S’s inclination to
limit his human contacts (72)? Some items
seem to have been hampered by evaluating
formulations (202, 226). Another plausible
cause of the inconsistent response patterns of
our Ss is the linking of aggression and self-
assertion on the one hand with projective de-
fense mechanisms on the other. The paranoid
pattern may very well be passive and anx-
iously submissive; and aggression, for that
matter, is typical of many other neurotic com-
plexes.


The Epileptoid Component 


Close to one-third of all items got unac-
ceptable DP values. Those items with rela-
tively high DP values (we exclude a number
of neurological questions where choice distri-
butions are extremely skew) have very little
in common except a certain prosaic attitude
toward life. The central symptoms of the
component have partly disappeared in the
[word illegible]: inspirational fixation to pur-
poses, meticulous precision in their execu-
tion. An important reason why the epi-
leptoid component is not more homogeneous
may be that Rosanoff relied too heavily on
the classical theory, now abandoned (1), that
epileptics constitute a special group even tem-
peramentally, a theory which guided the
choice of the epileptoid group in the valida-
tion study (3, 4).


Summary and Conclusions 


This paper is a study of the internal con-
sistency of the seven scales in the Humm-
Wadsworth inventory carried out in a group
of 508 male applicants for industrial work by 
means of a revised Likert analysis. Our re- 
sults show beyond doubt that the H-W test 
as a whole hardly fulfills even quite lenient 
demands for one-dimensionality. The manic, 
autistic, and depressive components tend to 
hang together, at least in parts, and might be 
improved if a number of items were refor- 
mulated or taken out of the scales. But those 
groups of questions which are supposed to 
measure the four remaining components can 
hardly be designated as scales. 

Some inconsistencies were perhaps stressed 
by our choice of sample. If the number of 
No-responses had been as low as in Humm’s 
samples (median around 167) it is possible 
that some DP values would increase. But we 
want to point out that all DP values were 
corrected for such a skewness in the distribu- 
tion of responses which has proved to be one 
consequence of a high No-count (8). There 
was, however, no way of curing those incon- 
sistencies in the choice of alternatives which 
were likely to appear in a group of job ap- 
plicants, i.e., a group where Ss, as we have 
shown in another study (8), tried to compose 
a more or less desirable image of themselves. 
But our task was not to study the H-W test 
under those optimal conditions where it was 
standardized but under those conditions for 
which it has been recommended—as a screen- 
ing tool. 

Moreover, our results indicate that the lack 
of internal consistency found in our study is 
hardly due to the choice of Ss but rather to 
the vague or ambiguous definitions of the 
components; the clumsy manner in which sev-
eral items, otherwise often acceptable, were
formulated; a frequently recurring moralistic 
bias distorting the neutral, exploratory aim of 
the inventory; and, last of all, the uncritical 
empiricism which guided the choice of items 
for the scales and allowed no further revisions. 


Received September 10, 1957. 


References 


1. Alström, C. H. A study of epilepsy in its clini-
cal, social and genetic aspects. Acta psychiat.,
Kbh., 1950, Suppl. 63.


2. Ekdahl, A. Attitude and environment. Unpub- 
lished manuscript. Lund Univer., 1953. 

3. Humm, D. G., & Humm, Kathryn A. Validity of 
the Humm-Wadsworth Temperament Scale: 
with consideration of the effects of subjects’ 
response-bias. J. Psychol., 1944, 18, 55-64. 

4. Humm, D. G., & Wadsworth, G. W. The Humm- 
Wadsworth Temperament Scale. Amer. J. 
Psychiat., 1935, 92, 163-200. 

5. Murphy, G., & Likert, R. Public opinion and the 
individual. New York: Harper, 1938. 


Gudmund Smith and Sven Marke 


6. Richardson, M. W. Notes on the rationale of
item analysis. Psychometrika, 1936, 1, 69-76.

7. Rosanoff, A. J. Manual of psychiatry and men- 
tal hygiene. (7th ed.) New York: Wiley, 
1938. 

8. Smith, G., & Marke, S. The influence on the re- 
sults of a conventional personality inventory 
by changes in the test situation. J. appl. Psy- 
chol., 1958, 42, 227-233. 

9. Weichelt, J. A. A first-order method for esti-
mating correlation coefficients. Psychometrika,
1946, 11, 215-221.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


An Anxiety Scale for the Strong Vocational Interest Inventory:
Development, Cross-Validation, and Subsequent Tests of Validity ¹


Glen D. Garman

VA Hospital, Ft. Douglas, Utah

and Leonard Uhr

University of Michigan


¹ This Scale was developed by Garman in a dissertation
carried out under the guidance of E. Lowell Kelly,
submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy at the
University of Michigan.


The Strong VIB has established a firm
place for itself in the assessment of vocational
interests. Moreover, it has been found a
useful source of information about additional
dimensions of personality when subjected to
clinical analysis in counseling and placement
situations (6). It was felt that this informa-
tion might be more systematically and objec-
tively exploited by the development of new
scales for the Strong.

Level of adjustment, of general importance
in the counseling or placement of an indi-
vidual, was considered a dimension whose
measurement by the Strong would be desir-
able. In the phenomena of neurosis, theorists
have given anxiety a central position, and em-
pirically there is difficulty in differentiating
measured “anxiety” from measured “neurot-
icism” (12, 13, 24). Since the criterion
measures used for the development of the
scale reported upon here are considered meas-
ures of anxiety and are significantly corre-
lated with other measures of anxiety (3, 4,
7, [illegible], 19, 20, 26), the present scale has been
labeled anxiety. However, it is expected that
its correlation with measures of “maladjust-
ment” or “neuroticism,” etc., or variables un-
derlying these complex phenomena will be as
high as its correlation with other anxiety
measures.


Development of the Scale

Both the theoretical relationship between
neuroticism and anxiety and a reported cor-
relation of .74 (13) between the Taylor MAS
and the Winne Scale of Neuroticism
(27) led to the decision to use a combination
of these two MMPI scales as a criterion meas-
ure. The eight items in common² to these
two scales were represented only once in the 
criterion measure. The correlation between 
these two scales for the graduate psychology 
group used in the present study was .65 with 
the item overlap, .54 when common items 
were divided between the two scales. The 
former figure is closer to coefficients quoted in 
more recent studies (11, 16, 26) than it is to 
the .74 originally quoted by Holtzman, Calvin, 
and Bitterman (13). 

The Strong and MMPI answer sheets were 
available for approximately 400 Ss who had 
participated in an assessment study on the 
prediction of performance in clinical psychol- 
ogy reported by Kelly and Fiske (15). Most 
of the Ss were first-year graduate students in 
psychology. This pool of Ss was randomly 
divided and the first half was used for de- 
velopment of anxiety indices for the Strong. 
The second half was then used for cross- 
validation of these indices. In addition, in 
order to test the indices on a more hetero- 
geneous sample, Strong and MMPI answer 
sheets were obtained for 200 male entering 
freshmen from the University of Minnesota.³
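
The random division into a development half and a cross-validation half can be sketched as follows. This is only an illustration: the function name, the use of a seeded shuffle, and the subject identifiers are assumptions, and the authors do not say how their randomization was actually carried out.

```python
import random


def split_sample(subject_ids, seed=0):
    """Randomly divide subjects into two halves.

    Returns (development, cross_validation). The seed is a hypothetical
    convenience so the split can be reproduced; it is not from the paper.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


if __name__ == "__main__":
    # Roughly 400 graduate Ss, as in the text; the IDs are placeholders.
    development, cross_validation = split_sample(range(400), seed=1)
    print(len(development), len(cross_validation))
```

Keeping the second half untouched until the scale is fixed is what allows the later validity coefficients to be read as cross-validation rather than as fit to the development data.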

As might be expected, the distribution of
Taylor-Winne scores had a high positive skew
for both college groups. With a possible
score range of 0 to 72, the mean and stand-
ard deviation for the graduate group were 8.2
and 5.8, respectively. The highest score for
this group was 35. For the freshman group,
the mean was 14.3, the standard deviation,
8.7.


² The eight items, as numbered in the MMPI
booklet, are: 43, 107, 186, 190, 191, 238, 242, 263.
Other workers have referred to six items common
to the MAS and the Winne. This may have been
due to a confusion of the item numbers in Taylor’s
“Biographical Inventory” and their numbers in the
MMPI booklet.

³ These data were obtained through the courtesy
of Ralph F. Berdie and the Student Counseling Bu-
reau of the University of Minnesota.

A variety of approaches was utilized in ex- 
ploring the Strong for anxiety indices, includ- 
ing empirical methods and methods based on 
a priori rationales, analysis of individual items 
and pattern analysis. The conventional item 
analysis method, utilizing high and low cri- 
terion Taylor-Winne groups, yielded the best 
measure. The resulting scale consists of 46
responses to 33 items.⁴
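
A conventional item analysis of this kind can be sketched as below. The sketch is a simplification under stated assumptions: each response is coded 0 or 1, the high and low criterion groups are taken as the top and bottom fraction of criterion scores, and a response is keyed +1 or -1 when its endorsement rate differs between the groups by at least a chosen amount. The fraction, the threshold, the unit weights, and all names are hypothetical; the authors' actual selection rule and scoring weights are not given in the text.

```python
def item_weights(responses, criterion, frac=0.27, min_diff=0.20):
    """Key responses by contrasting high and low criterion groups.

    responses: one row per subject, each a list of 0/1 codes (one per
               scorable response); criterion: one criterion score per subject.
    Returns {response_index: +1 or -1} for responses that discriminate.
    """
    n = len(criterion)
    k = max(1, int(n * frac))
    order = sorted(range(n), key=lambda i: criterion[i])
    low, high = order[:k], order[-k:]
    weights = {}
    for j in range(len(responses[0])):
        p_high = sum(responses[i][j] for i in high) / k
        p_low = sum(responses[i][j] for i in low) / k
        diff = p_high - p_low
        if diff >= min_diff:
            weights[j] = 1       # endorsed more often by the high group
        elif diff <= -min_diff:
            weights[j] = -1      # endorsed more often by the low group
    return weights


def scale_score(row, weights):
    """Sum the unit weights of the keyed responses a subject gave."""
    return sum(w * row[j] for j, w in weights.items())
```

Because some responses are keyed negatively, a scale built this way can take negative totals, which is consistent with the score range reported for the Anxiety Scale.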

The possible score range for the new Anx-
iety Scale is from -22 to +24. For the
freshman group, which probably constitutes
the most appropriate reference group, the
mean Strong anxiety score was -4, the
standard deviation, 5. The figures for the
graduate group were very close to these. The 
split-half reliability of this scale, calculated 
for the cross-validation group of graduate 
students was .73. 

When Strong Anxiety scores of the graduate
cross-validation group were correlated with
criterion scores, a coefficient of .36 was ob-
tained. Correction for attenuation due to
unreliability of the scales raises this coefficient
to .44. A validity coefficient of .42 with the
Taylor-Winne score on the MMPI was ob-
tained for the cross-validation group of col-
lege freshmen. Correction for attenuation
raises this to .51.
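
The correction can be illustrated with the standard formula for attenuation, in which the observed correlation is divided by the square root of the product of the two reliabilities. The text reports only the reliability of the Strong Anxiety Scale (.73); the criterion reliability is not reported, so the value used below is an assumption, chosen because it is roughly what the reported change from .36 to .44 implies if this formula was the one applied.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between true scores: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)


def implied_reliability(r_xy, r_corrected, r_xx):
    """Solve the same formula for the unreported reliability r_yy."""
    return (r_xy / r_corrected) ** 2 / r_xx


if __name__ == "__main__":
    # .36 and .73 are from the text; .92 is an ASSUMED criterion reliability.
    print(round(correct_for_attenuation(0.36, 0.73, 0.92), 2))
    # Criterion reliability implied by the reported .36 -> .44 correction.
    print(round(implied_reliability(0.36, 0.44, 0.73), 3))
```

The corrected figure estimates the relationship between error-free scores; the uncorrected coefficients remain the appropriate guide to how well the scale, as actually scored, predicts the criterion.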


Correlations with MMPI Scales 

In addition to its relationship to the cri-
terion variable, some relationships between the
Anxiety Scale and other measures can be re-
ported here. The MMPI scores of the fresh-
man group were available and these corre-
lated with the Strong Anxiety Scale as shown
in Table 1. It should be remembered that
the Anxiety Scale was developed with MMPI
items as the criterion.

It is perhaps not surprising that the highest
correlation should be with the Psychasthenia
Scale, since high enough correlations between
the MAS and the Psychasthenia Scale have
been reported (2, 9) to make it appear that
they are measuring essentially the same thing.
As to the meaning which these relationships
to MMPI scales lend to the Strong Anxiety
Scale, one might postulate the sensitivity of
this scale to a general constellation of: aware-
ness of and attention to internal processes,
both psychic and somatic, particularly when
these are unpleasant or disturbing; a tendency
to be aware of external presses and the emo-
tional impacts of interpersonal processes in
general; and a tendency to be frank in ad-
mitting or reporting these phenomena.


⁴ The items composing the scale, with their scoring
weights, are available from the Bureau of Psycho-
logical Services, University of Michigan.


Table 1


Correlations Between the Strong Anxiety Scale and
MMPI Scores for 200 College Freshmen


MMPI Scale               r

[label illegible]       .26**
L                      -.25**
[label illegible]      -.20**
Hs                     [illegible]
D                       .33**
[two rows illegible]
Pa                     [illegible]
Mf                     -.19**
[label illegible]       .15*
Pt                      .42**
[label illegible]       .33**
Ma                     [illegible]
[label illegible]       .37*


* Significant at the 5% level.
** Significant at the 1% level.


Discussion 


One crude attempt to place the relation-
ships demonstrated here in a theoretical con-
text might be somewhat as follows: Differ-
ences may exist in people to the extent to
which they are sensitive to external objects
or events and to the internal representation
of the impact of these objects or events. Also,
the internal response to the external environ-
ment may be via thought and feeling processes
and/or via somatic reactions. (For example,
whereas one person may respond to psycho-
logical threat with a feeling of fear or anx-
iety, another may become aware that his
stomach is “upset.” Our “anxiety” scales
have items that represent both.) To go one
step further, given sensitivity or responsive-
ness to external phenomena, one’s internal re-
sponse may be a pleasant or an unpleasant
one. If the internal response is unpleasant
and one is aware of it, he can find items in
an “anxiety” scale which will allow him to
communicate this. Also, if his internal ex-
periences have been more unpleasant than
pleasant, he may come to fear them and be
fearful, shy, and suspicious in interpersonal
relationships (paranoia), may avoid rather
than seek new experiences (lack of adven-
turousness), refrain from pushing the further-
ance of his own aims or desires in interper-
sonal contexts (lack of dominance and en-
durance), and wish that people would be
kinder to him and act in such a way as to
produce more pleasant internal experiences
for him (need for succorance).

It may be noted that the MAS, which
served as part of the criterion measure for
the Strong Anxiety Scale, was developed for
use in tests of drive theory. It was assumed
that MAS scores are related to “emotional
responsiveness,” which, in turn, contributes to
“drive.” Predictions on the basis of these
assumptions have been confirmed in a num-
ber of studies (23).


Further Studies Employing the Anxiety Scale 


In the course of a continuing study of se-
lection and evaluation of medical school stu-
dents being conducted at the University of
Michigan, a good deal of interesting infor-
mation has been amassed about the Gar-
man Anxiety Scale. Garman’s Scale seemed
promising enough to be used for those Ss in
the present medical study for which Strong
blanks were already available. It gave an
“anxiety score” at almost no cost, and helped
tap a personality area that had already been
demonstrated to be related to aspects of
medical school performance by Eron (10)
and Shoemaker and Rohner (18).


Methodology 
Three groups of Strong blanks were scored
on the Garman Anxiety Scale. First, in 1952,
the medical school senior class (Class of
1952) and the entire group of applicants
(including those who were to make up the
Class of 1956) were given the Strong. Second,
as part of a national study conducted by the 
Association of American Medical Colleges, the 
Strong was readministered to the Class of 
1956, during their senior year. Thus we have 
Garman Anxiety scores for the Class of 1952 
seniors, and for the Class of 1956 as both ap- 
plicants and seniors, with an intervening pe- 
riod of four years between test and retest. 

The data for the Class of 1956 allow us
to look at the test-retest reliability of the 
Anxiety Scale over a period of four years. 
The correlations between the Anxiety scores 
and all the other Strong scores, available for 
almost all keys at all three administrations, 
help us to set the Anxiety Scale into the con- 
text of other Strong scores and provide sug- 
gestive leads as to anxiety components of oc- 
cupational interests. (Since these scores come 
from the same pool of questions, there is 
built-in correlation to the extent that the 
same response may contribute to a score on 
several scales. However, this effect should be 
small—the Anxiety Scale is scored on only 33 
of the 400 Strong items.) Finally, a large 
number of additional variables have been cor- 
related with the Anxiety scores, for purposes 
of the larger assessment study (for a pre- 
liminary report, see [25]). From these, the 
present report will cull only the most interest- 
ing relationships. These are of two types: 
(a) correlations between Garman Anxiety 
scores and Anxiety-related scores measured 
by other instruments (Cattell’s 16 P.F., Ed-
wards’ Personal Preference Blank, Allport-
Vernon-Lindzey’s Study of Values, and Mc- 
Quitty’s Health Questionnaire), (b) correla- 
tions between Anxiety, measured as a part of 
the admissions procedure before students were 
accepted, and performance in medical school 
during the subsequent four years. 

The three Anxiety Scale administrations
will be designated: (a) 1956-Seniors (the
Class of 1956 tested as seniors), (b) 1956-
Applicants (the Class of 1956 tested as ap-
plicants), and (c) 1952-Seniors (the Class of
1952 tested as seniors). For the Class of
1956, N = 112; for the Class of 1952, N
= 116. The complete matrices from which
the following data have been abstracted are
available at the Bureau of Psychological
Services, University of Michigan, Ann Arbor.
The findings here reported for the Class of 
1956 are for all members of the senior class 
in attendance at the time of test administra- 
tion except women and Negroes, who were 
eliminated from the sample to increase homo- 
geneity. Data were incomplete for some stu- 
dents, because of the medical school’s stag- 
gered schedule, but the one-fourth of the 
Class of 1956 that was not in attendance at 
the time of readministration of the Strong 
was an unselected group, so that no bias in 
sampling should have been introduced. 


Results 


The four-year test-retest correlation of the 
Garman Anxiety Scale was .51. When it is 
remembered that the two administrations 
were separated not only by four years, but 
also by very different sets toward taking the 
test (in 1952 the students were applicants 
actively seeking admission to medical school, 
in 1956 they were successful seniors partici- 
pating in a research program), this relatively 
long-term reliability compares favorably with 


Table 2


Correlations Between the Garman Anxiety Scale and
Selected Strong Occupational Interests


                                    Garman Anxiety

                            Grad.        1952        1956
                            Psychol.     Medical     Medical
Occupational Interest       Students     Seniors     Seniors

Artist                         54       [illegible]     46
Mathematician              [illegible]      34          29
Chemist                        25           29           *
Production Manager            -32          -29         -36
Personnel Director            -28           15         -38
Musician                       32           33          32
Accountant                    -35          -40         -37
Office Man                 [illegible]     -43         -23
Banker                     [illegible]     -38         -11
Sales Manager                 -39          -41         -30
Author-Journalist              50           40          39
Interest Maturity          [illegible]     -23         -35
Masculinity-Femininity        -22          -35         -21


Note. Approximately 40% of the correlations between the
Anxiety Scale and the Strong scales were significant at the
1% level (r = [illegible] for the sample of Psychology Students;
r = .25 for the samples of Medical Students).
* This correlation is not available.




Table 3


Correlations Between the Garman Anxiety Scale and
Selected Personality Variables Measured by
the 16 P.F., the Personal Preference
Blank, the Study of Values, and
the Health Questionnaire


                                      Garman Anxiety
Personality Variable                  (1956 Seniors)

16 P.F. Factors
  Emotional Stability                  [illegible]
  Adventurousness                      [illegible]
  Paranoia                             [illegible]
  Hysteric Unconcern                   [illegible]
  Anxious Insecurity                       43
Personal Preference Blank Needs
  Succorance                           [illegible]
  Dominance                            [illegible]
  Endurance                               -24
Study of Values
  Economic                             [illegible]
  Aesthetic                                40
McQuitty Health Questionnaire              40


* Significant at the 5% level.
** Significant at the 1% level.


reliabilities reported for similar personality 
questionnaires.

Table 2 presents correlations, obtained both
by Garman on his original cross-validation
sample, and by Uhr and Kelly on the two
groups of medical students, between the Anx-
iety Scale and selected Strong Scales.

We thus find a large number of significant
correlations, with good agreement in most
cases between the results on all three sam-
ples. Anxiety as measured by the Garman
Scale is positively related to Artistic inter-
ests, exemplified by Artist and Author-Jour-
nalist; and Scientific interests, exemplified by
Mathematician and Chemist. It is negatively
related to Business and Sales interests, ex-
emplified by Production Manager, Banker,
and Sales Manager. Further, Anxiety is
negatively related to two of the personality
variables measured by the Strong: Interest
Maturity and Masculinity-Femininity.

Table 3 presents the significant correla-
tions between the Garman Anxiety Scale and
related personality variables from the Cattell
16 P.F. (5), the Edwards Personal Preference
Blank (8), the Allport-Vernon-Lindzey Study
of Values (1), and the McQuitty Health
Questionnaire (17). We find a number of
these variables significantly related to
the Anxiety Scale. Further, the correlations
with the Anxiety-related factors measured by
the 16 P.F. and with the McQuitty argue for
a great amount of construct validity of the
Garman Anxiety Scale.

In the course of the total medical school
study, five independent factors of Medical
School achievement were identified, measured,
and correlated with a large number of pre-
dictor variables, including the Anxiety Scale
scored on the Strong completed four years
previous to the measurement of the criteria.
For one of these factors, “Medical Knowl-
edge,” the Anxiety Scale predicted to a sig-
nificant extent (r = .20, significant at the 5%
level). This is the sort of low but significant
relationship to be expected from previous
studies, but two considerations make it of
more than usual interest. First, this is a pre-
dictive relationship over four years. Second,
of the 52 variables measured on the Strong, Anx-
iety was the only one that predicted “Medi-
cal Knowledge” to a significant degree.


Implications 


The data reported here provide additional
information regarding the Garman Anxiety
Scale for the Strong. We have indications of
satisfactory long-term reliability (despite the
very different situations at the time of the
two testings). We have evidence of validity
that stands up well in comparison with types
and magnitudes of validity reported for com-
parable instruments. We find a number of extremely
suggestive intercorrelations with a number of
vocational interest and personality variables.
Finally, we find a significant low relation
between Anxiety as a predictor and a cri-
terion factor of medical school performance.
The indications of correlates of Anxiety from
among personality, interest, and performance
variables bring us to questions beyond the
scope of the present report, but they are of
interest to us here because of their validating
role: the Garman Anxiety Scale is measur-
ing something that is related to the some-
things being measured by other psychological
instruments.

Possibly of more importance than the data 
so far collected on the Garman Anxiety Scale 
itself is the support it gives for this sort of 
interrelating of our numerous psychological 
instruments. These results would seem to 
lead to three guiding conclusions for possible 
extensions of work in this area. First, em- 
pirically derived scales permit making more 
measurements with the same investment of 
testing time, thus contributing importantly in 
combatting the often crippling limitations on 
psychological research imposed by time and 
expense. Second, interrelations between the 
numerous instruments we have at our dis- 
posal will help us know whether we are really 
measuring something new. And third, we 
might entertain the interesting possibility— 
one that would seem desirable for some fu- 
ture era of psychology when we are able to 
systematize our items and tailor the appro-
priate measuring instrument for each situa-
tion—of a pool of questions adequately sam-
pling all aspects of behavior rather than a
proliferation of question forms developed to
answer specific questions with no knowledge
of the larger context of behavior. Measur-
ing instruments for important variables could
then be built up by proper weighting of the
related items in the pool. Variables that
could not be measured by the available pool
of items would be used to enlarge the pool,
i.e., items associated with the new variables
would have been demonstrated to be useful
for the psychologist and would have earned
their place in his repertoire of tools.


Summary 


An Anxiety Scale for the Strong VIB has
been developed by item analysis, using a com-
bination of Taylor MAS and Winne Scale of
Neuroticism items on the MMPI as the cri-
terion measure. Four hundred graduate psy-
chology students, randomly divided into two
equal groups, were used as Ss. Scale items
were chosen by means of an analysis of the
first group and the Anxiety Scale validated
on the second group. A test of the Anxiety
Scale on a more heterogeneous sample was
made in a second validation using 200 male
college freshmen as Ss.

The Scale consists of 33 items on
which 46 responses are scored. Split-half re-
liability calculated for the first cross-valida-
tion group was .73. The two cross-validations
gave correlations with the criterion of .36 and
.42, which are raised by correction for at-
tenuation due to unreliability of the scales to
.44 and .51.

A number of significant correlations are re- 
ported for the cross-validation sample be- 
tween the Anxiety Scale, and MMPI and 
Strong VIB Scales. Data collected in a 
Medical School Assessment Project produced 
a number of interesting correlates of the Gar- 
man Anxiety Scale from among variables 
measured by the Strong VIB, the Cattell 16 
P.F., the Edwards PPI, the AVL Study of 
Values, and the McQuitty Health Question- 
naire scales. In addition to the vocational 
interest, personality factor, and values cor-
relates of anxiety thus identified, several anx-
iety scales on the 16 P.F. and the Health
Questionnaire were found to be significantly
related to Garman Anxiety. Four-year test-
retest correlation was .51.


Received September 23, 1957. 


References 


1. Allport, G. W., Vernon, P. E., & Lindzey, G.
Manual for the Study of Values. New York:
Houghton Mifflin, 1951.

2. Brackbill, G., & Little, K. B. MMPI correlates
of the Taylor Scale of Manifest Anxiety. J.
consult. Psychol., 1954, 18, 433-436.

3. Buss, A. H. A follow-up item analysis of the
Taylor Manifest Anxiety Scale. J. clin. Psy-
chol., 1955, 11, 409-410.

4. Buss, A. H., Wiener, Durkee, A., & Baer,
The measurement [remainder of entry illegible]

5. [Author and title partly illegible] for the 16 P. F. Test.
Institute for Personality and
Ability Testing, 1950.

6. Darley, J. G. Clinical aspects and interpretation
of the Strong Vocational Interest Blank. New
York: Psychological Corp., 1941.

7. Davids, A. Relations among several objective
measures of anxiety under different conditions
of motivation. J. consult. Psychol., 1955, 19,
275-279.

8. Edwards, A. L. Manual for the Edwards Per-
sonal Preference Schedule. New York: Psy-
chological Corp., 1953.

9. Eriksen, C. W., & Davids, A. The meaning and
clinical validity of the Taylor Anxiety Scale
and the Hysteria-Psychasthenia Scales from
the MMPI. J. abnorm. soc. Psychol., 1955,
50, 135-137.

10. Eron, L. D. The effect of medical education
on medical students’ attitudes. J. med. Educ.,
1955, 30, 559-566.

11. Gallagher, J. J. Manifest anxiety changes con-
comitant with client-centered therapy. J.
consult. Psychol., 1953, 17, 443-446.

12. Gleser, G., & Ulett, G. The Saslow Screening
Test as a measure of anxiety proneness. J.
clin. Psychol., 1952, 8, 279-283.

13. Holtzman, W. H., Calvin, A. D., & Bitterman,
M. E. New evidence for the validity of Tay-
lor’s Manifest Anxiety Scale. J. abnorm. soc.
Psychol., 1952, 47, 853-854.

14. Hoyt, D. P., & Magoon, T. M. A validation
study of the Taylor Manifest Anxiety Scale.
J. clin. Psychol., 1954, 10, 357-361.

15. Kelly, E. L., & Fiske, D. W. The prediction of
performance in clinical psychology. Ann Ar-
bor: Univer. Michigan Press, 1951.

16. Kerrick, Jean S. Some correlates of the Taylor
Manifest Anxiety Scale. J. abnorm. soc. Psy-
chol., 1955, 50, 75-77.

17. McQuitty, L. L. Health questionnaire. Univer.
Illinois, 1952. (Mimeo.)

18. Shoemaker, H. A., & Rohner, J. H. Relation
between success in the study of medicine and
certain psychological and personal data. J.
Assn. Amer. Med. Coll., 1948, 23, [pages illegible].

19. Siegman, A. W. Cognitive, affective, and psy-
chopathological correlates of the Taylor Mani-
fest Anxiety Scale. J. consult. Psychol., 1956,
20, 137-141.

20. Sinick, D. Two anxiety scales correlated and ex-
amined for differences. J. clin. Psychol., 1956,
12, 394-395.

21. Strong, E. K., Jr. Vocational interests of men
and women. Palo Alto: Stanford Univer.
Press, 1943.

22. Taylor, Janet A. A personality scale of mani-
fest anxiety. J. abnorm. soc. Psychol., 1953,
48, 285-290.

23. Taylor, Janet A. Drive theory and manifest
anxiety. Psychol. Bull., 1956, 53, 303-320.

24. Welsh, G. S. An anxiety index and an inter-
nalization ratio for the MMPI. J. consult.
Psychol., 1952, 16, 65-72.

25. Whitaker, W. L., Kelly, E. L., & Uhr, L. M.
Michigan project on the prediction of profes-
sional competence in medicine. J. med. Educ.,
1957, 32, 187-196.

26. Windle, C. The relationships among five
“anxiety” scales. J. consult. Psychol., 1955,
19, 61-63.

27. Winne, J. F. A scale of neuroticism. J. clin.
Psychol., 1951, 7, 117-122.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


Contextual Effects in Scaling ¹


Bernard J. Fine and Donald F. Haggard


Quartermaster Research & Engineering Center Laboratories, Natick, Massachusetts


¹ The authors are indebted to E. R. Dusek, J. M.
McGinnis, and W. H. Teichner for their helpful criti-
cism of the initial draft.


Jones and Thurstone (7) have recently
described a method for the selection of de-
scriptive words and phrases for use as anchor
points on subsequent successive interval pref-
erence scales. One assumption underlying
their method is that the empirical meaning
of a word remains approximately constant
within a particular contextual framework. Ac-
cordingly, they propose that adjectives scaled
within the context of “food” may be used in
rating specific food items without an ap-
preciable change in empirical meaning.

Although variation in the meaning of adjec-
tives may be limited by restricting the con-
text within which they are scaled to a generic
term, “food,” there is still reason to suspect
considerable variation in meaning when the
same adjectives are later applied to specific
items within that class. Helson’s studies (4)
of the adaptation level concept indicate that
the general frame of reference would vary be-
tween lists of items when the lists are at dif-
ferent preference levels. This suggests the
possibility that the frame of reference also
varies between specific items at different
preference levels so that the empirical mean-
ing of descriptive adjectives on a scale might
vary considerably with the specific item be-
ing rated.

Somewhat related to this position is the
finding of Hovland and Sherif (5) that the
attitude of judges rating potential scale items
affects the scale values even within the rela-
tively narrow context of “Negro.” Within
the general context of “food,” then, it
could be assumed that the food biases of
judges rating potential scale items would af-
fect the scale values of those items. The pur-
pose of the present study was, therefore, to
determine the effect of specific contextual lev-
els upon the empirical meaning of some de-
scriptive words and phrases used in scale con-
struction. More specifically, it was hypothe-
sized that the scale values of adjectives rated
in the context of “food” would increase sig-
nificantly when rated in the specific context
of a highly acceptable food and decrease sig-
nificantly when rated in the context of an un-
acceptable food.

Basic to the study of contextual effects is 
the determination of the consistency of em- 
pirical meaning over repeated ratings within 
the same context, as inconsistency might ob- 
viate any examination of changes due to vary- 
ing contexts. Several studies have indicated 
repetitional factors that might lead to changes 
in empirical meaning. Thurstone (9) has 
suggested that repetition results in less dis- 
crimination due to boredom, with a subse- 
quent increase in the dispersion of scale 
values. Jones (6) obtained results which 
corroborate this assumption. His studies in- 
dicated that repeated administration results in 
an increase in the error variance with no appreciable change in the
level of response.

With regard to the average level of empirical meaning, Guilford (2)
has assumed, based on Helson’s adaptation level studies, that the
total range of stimuli to which the S is exposed during the
experiment influences the level of his responses. Thus, the S,
during the experiment, develops a “central standard level” as a
frame of reference for all of his judgments. Such an effect might be
supposed to be almost immediate when a comparatively small number of
items is used, e.g., Jones (6). However, when a fairly large number
of items are involved, the composite standard might vary during the
first administration as items at different preference levels are
encountered and on repeated administrations as increasing numbers of
items are recalled. Since the present study used a large number of
items, it was necessary first to determine the consistency of
empirical meaning over repeated administrations so that the effects
of changing context could be more validly examined.




Method 
Subjects 


The Ss were 145 female clerks, secretaries, and 
stenographers at the QM Research & Engineering 
Center, Natick, Massachusetts.


Scale 


The 51 words and phrases and the nine-category, 
successive-interval schedule used by Jones and Thur- 
stone (7) provided the basis for S’s judgments. 
Four forms of the questionnaire, each containing the 
same words and phrases in different random orders, 
were used. Forms A and B contained identical in- 
structions: “In this test are words and phrases that 
people use to show like or dislike for food. For each 
word or phrase make a check mark to show what 
the word or phrase means to you.” The instructions 
for Forms C and D were changed only by substituting the name of a
particular food item in place of the word “food”; Form C contained
the words “roast beef” and Form D the words “stewed kidneys.” Each
of the forms included appropriate examples.


Procedure 


Form A was administered individually to all Ss 
during the first week of the study and Form B to 
the same Ss one week later. Two weeks after filling 
out Form B, one half of the Ss completed Form C 
and the other half, Form D. 

Several Ss were omitted for administrative reasons 
during the course of the experiment. In addition, 
Ss showing inconsistent performance, based on the 
criteria established by Jones and Thurstone (7), 
were eliminated from the study.

The psychophysical method presented by Edwards
(1) for scaling by successive intervals was used to
determine the scale values of the items on each
questionnaire. Words showing cumulative proportions

Table 1 
Rater Reliability and Algebraic Deviations of Scale Values 
                        Rater Reliability     Algebraic Deviations
                            (n = 129)              (n = 35)
                         Lower    Upper      “Food” to   “Food” to
Word                     Bound    Bound       “Beef”      “Kidney”
2. Despise 94 94 = nit 
3. Loathe 93 96 = om 
4. Best of all -90 94 = 7 
5. Dislike extremely 89 93 = ~ 
6. Dislike intensely .88 .92 = = 
7. Excellent 87 93 = = 
8. Favorite 83 82 = ~ 
9. Like intensely -76 84 = a 
10. Dislike slightly 75 81 +.33 — 43 
11. Like slightlya 45 ‘80 —07 +.05 
12. Mildly likes 70 74 —04 -14 
13. Dislike very much -69 ‘84 = 
14. Strongly dislike 09 ‘81 = 5 
15. Like extremely .69 ‘80 = = 
16. Terrible “68 Pa = — 
17. Mildly dislikes 68 73 T —.67 
18. Wonderful .67 ‘81 ia = 
19. Like quite a bita 67 ‘30 - 02 
20. Like very wells .67 ‘67 age 2 ‘88 
21. Very bad .66 ‘80 ra F- 
22. Especially goods .62 75 ay 6 
23. Like very much 59 70 te — 36 
24. Highly unfavorable .58 3 Tog = 
25. Goode 58 ‘30 Le 07 
; +.08 +0 
P Wer comin a Fon 


a negative change in scale value, 


extremes 
“Any of .23 is necessary for signific: 


‘ance at the 01 level, 


m A with Form B, 
and could not be scaled, 


es 
— valu 
+ values indicate a positive change and 


| 


Table 1—Continued

                        Rater Reliability     Algebraic Deviations
                         Lower    Upper      “Food” to   “Food” to
Word                     Bound    Bound       “Beef”      “Kidney”
26. Highly favorable* 57 .72 — =— 
27. Like moderately* 7 -69 —.48 —.79 
28. Like fairly welle .57 .61 =.17 =s 
29. Very good* 56 66 +.40 <a 
30. Strongly like" 56 -66 +.61 -37 
31. Dislike moderately* 54 71 +17 —.67 
32, Bad 54 -69 +.35 —.12 
33. Average” 54 .65 +.69 —.37 
34. Pairs 53 ak +.36 — 47 
35. Likes 53 .70 —.15 —.27 
36. Acceptable” 50 .69 +.45 +.34 
37. 0. Ke 50 67 —.30 —.21 
38. Mighty fines 9 65 +.34 +.21 
39. Not pleasing* AT 58 +45 —.68 
40. Enjoy 46 -63 +.41 — 36 
41. Pleasing® 46 56 +13 —.05 
42. Tasty 45 65 00 ~.23 
43. Only fairs 2 62 +45 +.01 
44. Like not so much* 42 52 +.12 — 43 
45. Poor Al 63 +.18 —.19 
46. Preferred’ 40 62 +.72 — 42 
47. Dislike 40 02 +.32 13 
48. Welcome* AO 58 —.39 —.61 
49. Don’t like .39 .60 +.37 —.18 
50. Don’t care for it® 38 -60 +.13 —.09 
51. Like not so well® 32 52 +.21 —-56 


in excess of .50 in the extreme categories could not be scaled by
this method and were omitted. No attempt was made to derive scale
values by graphic methods, since they were not deemed exact enough
for the purposes of this study. After omitting the unscalable items,
31 items were available for the scale value comparison of Form A
with Form B and 35 items for the comparisons of Form B with Forms C
and D.
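The scaling step described above can be illustrated with a small sketch. It is a simplified variant of successive-interval scaling that assumes equal (unit) discriminal dispersions for all items; it is not offered as the exact procedure of Edwards (1). The function names and the input layout are assumptions made for illustration, and the check on proportions only guards against undefined normal deviates; it is not the omission rule used in the study.

```python
from statistics import NormalDist, mean


def cumulative_proportions(counts):
    """Turn the category frequencies for one item into cumulative
    proportions at the upper boundary of every category but the last."""
    total = sum(counts)
    running = 0
    props = []
    for c in counts[:-1]:
        running += c
        props.append(running / total)
    return props


def scale_values(cum_props):
    """Estimate item scale values from cumulative proportions.

    cum_props[i][j] is the proportion of judgments of item i falling at
    or below category boundary j. Unit discriminal dispersions are
    assumed (an assumption of this sketch, not a claim about the paper).
    """
    inv = NormalDist().inv_cdf
    for row in cum_props:
        if any(p <= 0.0 or p >= 1.0 for p in row):
            raise ValueError("proportions of 0 or 1 have no normal deviate")
    # Normal deviate of every boundary as seen from every item.
    z = [[inv(p) for p in row] for row in cum_props]
    n_bounds = len(z[0])
    # Boundary location: mean deviate of that boundary over all items.
    bounds = [mean(row[j] for row in z) for j in range(n_bounds)]
    # Item scale value: mean distance from the item up to the boundaries.
    return [mean(bounds[j] - row[j] for j in range(n_bounds)) for row in z]
```

Under these assumptions the scale values are centered on zero, so only differences between items are meaningful.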


Results and Discussion
Consistency of Response


The primary purpose of the first phase of this study was to
determine the consistency of the raters’ responses over repeated
administrations and the effect of repetition upon the empirical
meanings of the adjectives. To determine the consistency of the
raters’ responses from Form A to Form B, ordinal numbers were
assigned to the nine-point scale upon which the raters made their
judgments and Guttman’s (3) analysis for qualitative data was
applied to obtain the upper and lower bounds for the reliability of
each of the 51 adjectives. The resulting coefficients are presented
in Table 1. It will be seen that, while the lower bound coefficients
are all significantly greater than zero, the majority are
sufficiently low to indicate considerable inconsistency in rating
any one adjective. This inconsistency might be due to (a) ambiguity
of the subjective meaning of the adjectives, (b) an increase in the
number of random responses on the second administration due to
boredom, or (c) variations in the level of response due to changes
in the composite standard.

Jones (6) has demonstrated a normal error 
distribution for repeated ratings of items, so 
ambiguity of the subjective meanings would 




not be expected to affect the resulting unit of 
scale measurement. However, Jones found 
that an increase in the number of random 
responses over repeated administrations re- 
sulted in an increase in the unit of measure- 
ment. To determine the effect of this factor 
in the present study, the covariation of the 
scale values from the two forms was studied. 
Scale values for the adjectives based upon the 
distributions of response on Form B were 
plotted against scale values based upon dis- 
tributions of response on Form A as shown in 
Fig. 1. The coefficient of correlation is .99 
and the values do not deviate significantly 
from a linear function. The slope, 1.22, of 
the line of best fit indicates that the unit of 
measurement based upon the responses on 
Form B is about 20% larger than on Form A. 
This is consistent with Jones’ (6) finding of 
an increase in size of unit of measurement. 
To test for a change in the level of the responses that might be
attributed to changes in the composite standard of the raters, a t
test for related measures (8) was calculated for the scale values of
the two forms. The resultant t of 1.51 with 30 degrees of freedom
was not significant at the .05 level of significance. Thus, on the
basis of these tests one might conclude that further administrations
within the same context could lead to further increases in response
dispersion but not changes in response level.

Fig. 1. Scale values based upon responses to Form B plotted against
scale values based upon responses to Form A. (Axes: scale value on
Form A, scale value on Form B; the line of equality is shown.)

Fig. 2. Scale values obtained under specific contexts of “roast
beef” (Form C) and “stewed kidneys” (Form D) compared with scale
values derived under the general context, “food” (Form B). (Axes:
scale value on Form B, scale value on Form C or Form D.)
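The comparison of units of measurement rests on the slope of the line of best fit and the coefficient of correlation between two sets of scale values. A minimal sketch follows, assuming the line of best fit is the ordinary least-squares regression of one form on the other (the paper does not state the fitting method). The function name and the sample values are hypothetical and are not the study's data.

```python
from math import sqrt


def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical scale values, chosen so that the second form has a
# unit of measurement 22% larger than the first.
form_a = [-2.0, -1.0, 0.0, 1.0, 2.0]
form_b = [1.22 * v for v in form_a]
slope, intercept, r = fit_line(form_a, form_b)
```

A slope above 1.0 with a correlation near 1.0 indicates a larger unit of measurement on the second form without a change in the ordering of the items.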


Contextual Effects 


To examine the contextual effects, scale values obtained from the
responses of Ss under the specific context “roast beef” were
compared with the scale values derived from the responses of the
same Ss under the context of “food”; a similar comparison was made
for Ss responding under the specific context of “stewed kidneys.”
These relationships are plotted in Fig. 2. Neither relationship
deviated significantly from a linear function. In both cases, the
correlation coefficient was .99 and the slope of the line of best
fit was approximately 1.0, indicating that the unit of measurement
was approximately equal for the three forms.2

To test for differences in the mean level of response due to changes
in the contextual orientation of the raters, algebraic deviations of
the scale values were obtained between the control and test
conditions. The deviations are presented in Table 1. It can be seen
that the majority of the deviations are in the predicted direction.
A t test for related measures (8) was calculated for each
comparison. Both t’s (4.58 and 6.17 with 34 degrees of freedom) are
significant at the .01 level of significance, indicating
verification of the hypotheses.

2 Such a finding cannot be interpreted as indicating that the
boredom effect remains constant after the second administration.
The introduction of a more specific context may have decreased the
ambiguity of the descriptive phrases thereby interacting with
increased randomness of response to negate any possible change.

While a number of scaling procedures assume that the empirical
meaning of an adjective remains fairly stable within a particular
context, the results of the present study suggest that this
assumption may not always be valid. Thus, within the “restricted”
context of “food,” the further restriction of context by the
introduction of specific foods has been demonstrated to increase, or
decrease, the scale values by a constant amount; this change seems
to be directly related to the level of acceptance of the contextual
food. This finding suggests that an empirical determination of the
extent to which a context is restricted may be necessary before
valid application of a derived scale can be made. It further
suggests that when a scale derived in a particular context, e.g.,
“food,” is applied as a continuous interval scale to the rating of
seemingly homogeneous items, e.g., specific foods, the resultant
scale values may be inaccurate due to the raters’ biases concerning
the rated items.
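The t test for related measures applied to the paired scale values can be sketched as follows. This is the standard paired-difference form of the statistic, given here as an illustration rather than as a transcription of the computation in Lewis (8); the function name is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def related_t(first, second):
    """Return (t, df) for paired measures taken on the same items."""
    # Algebraic deviation of each item: second condition minus first.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Mean deviation divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

With 35 paired scale values the function returns 34 degrees of freedom, matching the comparisons of Form B with Forms C and D.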


Summary 


This study was designed to determine the effect of specific
contextual levels upon the empirical meaning of adjectives used in
scale construction. It was hypothesized that the scale values of
adjectives rated in the context of “food” would increase
significantly when rated in the more specific context of a highly
acceptable food and decrease significantly when rated in the context
of an unacceptable food.

Four forms of a scale, identical in content but varying in order of
words, were administered to 145 female Ss. All Ss received Forms A
and B, which were in the context of “food,” one week apart. Two
weeks later, one half of the Ss received Form C in the specific
context of “roast beef” and the other half Form D in the specific
context of “stewed kidneys.”

Analysis of the results indicated that the hypotheses were verified.

Findings regarding consistency of responses are also presented and
discussed.


Received July 8, 1957.


References 


1. Edwards, A. L. The scaling of stimuli by the method of successive
intervals. J. appl. Psychol., 1952, 36, 118-122.

2. Guilford, J. P. Psychometric methods. New York: McGraw-Hill,
1954.

3. Guttman, L. The test-retest reliability of qualitative data.
Psychometrika, 1946, 11, 81-95.

4. Helson, H. Adaptation-level as a basis for a quantitative theory
of frames of reference. Psychol. Rev., 1948, 55, 297-313.

5. Hovland, C. I., & Sherif, M. Judgmental phenomena and scales of
attitude measurement: Item displacement in Thurstone scales.
J. abnorm. soc. Psychol., 1952, 47, 822-832.

6. Jones, L. V. Psychophysics and the normality assumption: an
experimental report. In D. R. Peryam, F. J. Pilgrim, & M. S.
Peterson (Eds.), Food acceptance testing methodology symposium.
Chicago: QM Food & Container Institute, 1953, pp. 105-112.

7. Jones, L. V., & Thurstone, L. L. The psychophysics of semantics:
an experimental investigation. J. appl. Psychol., 1955, 39, 31-36.

8. Lewis, D. Quantitative methods in psychology. Iowa City, Iowa:
Gordon Bookshop, 1948.

9. Thurstone, L. L. Some new psychophysical methods. In D. R.
Peryam, F. J. Pilgrim, & M. S. Peterson (Eds.), Food acceptance
testing methodology symposium. Chicago: QM Food & Container
Institute, 1953, pp. 100-104.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


Personality Characteristics Related to Leadership Behavior 
in Two Types of Small Group Situational 
Problems * 


Walter R. Borg 
Utah State University 
and Ernest C. Tupes 


Personnel Laboratory, Wright Air Development Center 


The situational test involving small groups 
of Ss has made marked gains in popularity in 
the course of the last few years. Certain basic 
problems exist, however, for which we must 
find better solutions than we now have before 
the small group situational test can be used 
with confidence as a measurement tool. Per- 
haps the most important problem is concerned 
with the degree to which characteristics of the 
individual S determine the S’s role in small
group activities. The “great man” theory of 
small group leadership considers the charac- 
teristics of the individual to be a major factor 
(1, 3). Other theories emphasize the nature 
of the small group situation or the interaction 
with other group members as the major vari- 
able in small group behavior (2, 4, 5, 7, 8, 9, 
10). If the “great man” theory holds even 
to a limited degree (that is, when type of 
problem and composition of group are varied 
only within moderate limits) it would permit 
the prediction of certain aspects of the S’s 
small group behavior such as leadership emer-
gence without knowing the precise nature of
the small group situation or the specific indi- 
viduals making up the remainder of the group. 
This would be particularly advantageous in 
military selection and classification where the 
“tailor making” of a group is not generally 
feasible. 

Implicit in the above are several hypotheses 
concerning certain aspects of small group be- 
havior and situational testing. The present 
study was designed to test the following specific hypotheses, which
are stated as questions to provide greater clarity:

1. Is small group leadership behavior as measured by observer
ratings related to personality characteristics as measured by
personality trait ratings?

2. Are there differences in the levels of relationships between
personality characteristics and leadership scores on two different
types of situational tests?

3. Are relationships between personality characteristics and
leadership scores on situational tests a reflection of overall halo
or do they indicate that specific personality characteristics are
related to small group leadership?

4. Do the patterns of relationships between personality
characteristics and leadership scores on two different types of
situational tests differ significantly?

5. Does the degree or the pattern of the relationship between
ratings of personality characteristics and leadership scores change
significantly when the type and amount of contact between the rater
and the subject is changed?

* The experimental work for this study was carried out under the Air
Force Personnel and Training Research Center in support of Project
7719, Task 17008. Permission is granted for reproduction,
translation, publication, and use or disposal in whole or in part by
or for the United States Government.


Method 
Subjects 


The data analyzed in the present investigation were gathered in
connection with an assessment of all male members of USAF Officer
Candidate School Classes 1955B and 1955C. The study is based on 125
male graduates of Class 55B and 96 male graduates of Class 55C for
whom complete test data were available. Before entering OCS the Ss
had been screened in such a way that all were at least high school
graduates with at least one year of enlisted military service.
Eighty-five per cent of the group would fall in the upper 10% of the
general population with respect to general intelligence, and the
majority were planning on a career in the Air Force.


Procedure 


When the Ss in each class reported to the school (55B in late
December 1954 and 55C in March 1955), they were assigned to six-man
teams, in such a manner that no member of a team had been acquainted
with any other team member prior to reporting to OCS. Fellow team
members were in very close contact during the three and one-half day
testing period and were isolated from other teams, except that they
shared a barracks floor and ate with members of one other team. Each
team was administered the group performance tests, which make up two
of the variables in the present study.

At the close of the testing period, the Ss were reformed into OCS
flights of about 20 candidates. Each flight was supervised by a
Tactical Officer and a corresponding flight of OCS upperclassmen.
Members of a flight were quartered together and attended classes and
participated in other types of training as a unit. In reforming the
assessment groups into OCS flights, care was taken so that no two
members of any assessment group were assigned to the same OCS
flight.


Variables 
Small group leadership tests. 1. Project X Leadership Test. This
small group leadership test consisted of 12 situational construction
problems. These problems all required cooperation of team members in
devising and carrying out a solution, and provided a situation in
which leadership behavior could be displayed. In a typical problem,
the team was placed in a prison compound and required to escape
across a moat and over a solid board fence. A ladder was provided,
but it was not long enough to reach from the edge of the moat to the
top of the fence. Several short lengths of rope were also provided.
Certain areas, such as the moat, were painted red and could not be
touched. The problem was solved by holding the ladder at an angle
over the moat with the ropes. One team member climbed the ladder and
jumped to the fence. Once over the fence he found additional props
which could be used to enable the remainder of the team to escape.
A time limit of 12 minutes of actual working time was imposed for
each problem, which was not usually sufficient to complete the
problem. Solution of the problem was not necessary, however, as the
rating scales used were directed toward behavior of team members
rather than solution of the problem. The problems were administered
by Air Force officers attached to the Officer Candidate School. Two
officers were assigned to the team for the first six problems. At
the end of six problems, the officers were shifted and the last six
problems were administered by two other officers. Six upperclassmen
were assigned to each team to act as raters. A rater was assigned to
each S in the team, and rated only this S on a given problem. At the
end of each problem, raters were rotated, so that each S was rated
twice by each rater in the course of completing the 12 problems.

The rating device used was a behavior check list 
consisting of 11 categories, each concerned with one 
aspect of leadership behavior. The rater was asked
to place a check mark for the man he was observing 
each time the man displayed one of the specific be- 
haviors in the check list. Scores were obtained for 
each S by summing the checks he received across all 
12 problems. 

Each of the four officers supervising the team dur- 
ing the 12 problems made independent general rat- 
ings of leadership ability based on their observations 
of the Ss. These ratings were combined with the 
check list rating scores to obtain the Project X lead- 
ership scores used in this research. This score has 
an estimated reliability (based on rater agreement) 
of .80. 

2. Leaderless Group Discussion. This test was 
used and scored in essentially the form outlined by 
Bass (1). A topic (In the event of all-out mobiliza- 
tion, should the Air Force rely on the AFROTC pro- 
gram or a greatly expanded OCS for its major source 
of new officers?) was assigned to each group. Thirty 
minutes were allowed for discussion. Four observers rated each S on
each of nine leadership behavior items, using a five-point rating
scale.

Analyses indicated that the four raters agreed moderately well on
the ratings of the separate items. The items were found to
intercorrelate fairly highly, indicating the presence of a large
general factor. A total Leaderless Group Discussion score was
computed for each S by summing the ratings he received across all
nine items and all four raters. Based on the agreement between
raters, the total LGD score had an estimated reliability of .82.

Personality ratings. Ratings on a number of bipolar personality
traits similar to those of Cattell (6) were obtained on each S on
three occasions. The first set of ratings was obtained from two
upperclassmen who observed the S’s behavior in six role-playing
interpersonal problem situations dealing with military
administration, management, and human relations. The total time of
observation on which the personality ratings were based was less
than one hour. These ratings, designated below as the 1-Hour
ratings, had an estimated average reliability of .35 based on the
agreement between the two raters.

A second set of personality ratings was made by members of the S’s
6-man team and by members of the team with whom he had shared the
barracks floor, at the end of the 3½-day assessment period. These
ratings are designated hereafter as the 3-Day ratings. Their average
reliability, as estimated from the agreement among the 12 raters,
is .45.

The third set of personality ratings was obtained at the end of the
six months’ OCS period. All members of each flight (15 to 20) rated
each other. These ratings are called the 6-Month ratings. Their
average reliability, based on the agreement among raters, is
about .85.

These three sets of ratings are experimentally independent since no
rater rated any S on more than
one set of ratings. Further, raters on any one set 
of ratings had no opportunity to observe their Ss in 
the behavior situations on which any other set of 
ratings was based. Thus, the 6-Month raters had 
not observed their Ss during the assessment nor had 
the 3-Day raters observed their Ss during the situa- 
tional problems on which the 1-Hour ratings were 
based, etc. 

Based on the intercorrelations among similar rat- 
ings obtained on several earlier OCS classes, the rat- 
ings of the present study were combined into 12 
clusters, containing from one to seven variables each. 
These clusters are listed below. 


1. General Adjustment.
2. Extroversion.
3. Effective Intelligence.
4. Determination.
5. Assertiveness.
6. Social Maturity.
7. Lack of Neuroticism.
8. Unconventional-Conventional.
9. Attentiveness to People.
10. Insistently Orderly.
11. Adaptability.
12. Energetic.


Analysis of Data 


Pearson Product-Moment correlations were com- 
puted between the scores on each situational test and 


Obtained Under 


Project X Scores 


Personality 
ariable 


11 27** 
3. Intelligence 00 10 
4. Determination 04 19** 
5. Assertiveness 15* 30** 
6. Social Maturity 07 4g** 
7. Lack of Neuroticism 04 24** 
8. Conventionality 00 16* 
9. Attentiveness 06 04 
10. Orderliness 15* 08 
11. Adaptability —07 05 
12. Energetic 13* 27** 
Average 05 18** 
Note.—Relationships expressed as Fisher’s z equivalents
* Significant at the 5% level.
** Significant at the 1% level.


Several Conditions 


Rating Condition 




the three sets of ratings on each personality trait cluster. These
72 correlation coefficients were converted to Fisher’s z
equivalents, and the specific null hypotheses tested by a triple
classification analysis of variance.
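The conversion of correlation coefficients to Fisher's z equivalents, and the averaging of correlations on the z scale, can be sketched as below. The transformation itself is standard; the function names and the averaging helper are assumptions added for illustration and do not reproduce the authors' analysis of variance.

```python
from math import atanh, tanh


def fisher_z(r):
    """Fisher's z equivalent of a correlation coefficient r."""
    return atanh(r)


def mean_correlation(rs):
    """Average several correlations on the z scale and convert back to r."""
    z_bar = sum(fisher_z(r) for r in rs) / len(rs)
    return tanh(z_bar)
```

For small correlations z and r are nearly equal, which is why tabled z values of the size reported here read much like the correlations themselves.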


Results 


In Table 1 are shown the Fisher’s z equiva- 
lents of the correlations between the Project X 
and LGD scores and each trait variable as 
rated under each rating condition. These relationships are not
independent so that a comparison of the obtained distribution of z’s
with that expected by chance is not appropriate. However, the fact
that 36 (one-half) of the z’s are significant at or beyond the 5%
level, and the fact that the average z based on the whole table is
significant at the 5% level, make it fairly safe to give an
affirmative answer to the first hypothesis. It may be concluded that
some relationship exists between personality characteristics and
performance on small group leadership tests.

Table 2 presents the results of the analysis of variance designed to
test the remaining hypotheses.


The significance of the main effects of prob- 


LGD Scores 
e 
Rating Condition i All 
= 13* —(Q2 00 06 Oe 
35** 21** 296%  22** egi 
04 19** 299% ga i 
—04 11 10 13* 0 iet 
27** 19** oge 22» 2 fee 
11 15* 31**  32** E 
o5 12 09 14* 08 
14* —02 11 10 08 
—01 13* 16* 09 06 
—07 14* 06 02 06 
03 18** 3 18** a1** 
15* 22** 31%" 19** 
* 
07 13* 16* 16* 13 


(with decimals omitted).


Table 2

Analysis of Variance of Relationships Between
Leadership Scores and Personality
Trait Ratings

Source                        df    Variance      F       p
Problem Type                   1     465.1       6.6     .05
Rating Type                    2     380.8       5.4     .05
Trait Variable                11     456.2       6.5     .001
Problem × Rating               2     195.0       2.8     NS
Problem × Trait               11   [illegible]    —       —
Trait × Rating                22      28.2        —       —
Problem × Rating × Trait      22      70.3
Total                         71


lem type and trait variable confirms the second and third
hypotheses. It appears that the level of the relationship between
personality trait ratings and situational test performance is a
function of both the kind of problem and of the specific traits
rated.

The main effect of rating type was also found to be significant,
indicating that the level of the correlations between trait ratings
and situational test performance is also a function of the rating
conditions.

None of the three first order interactions were significant,
indicating that the patterns of the correlations between personality
traits and situational test performance do not differ significantly
when either the problem type or rating condition was varied. Further
evidence of a positive nature relating to the similarity of the
patterns of the relationships is presented in Table 3, which shows
the intercorrelations of the patterns of correlation coefficients.


Conclusions


The results of this study have led to the conclusion that the
personality traits associated with successful performance in two
types of small group activity do not differ in relative importance,
although there is evidence to show that, overall, the personality
characteristics employed in this research are more highly related to
the Leaderless Group Discussion than to Project X. This is
especially true in view of the fact that Project X, which consists
of 12 construction-escape problems, apparently provides an
opportunity to observe a greater range of behavior characteristics
than does the LGD which consists almost entirely of verbal (oral)
activity. It should not be concluded from these results that the two
types of problem yield equivalent scores or that, apart from
personality characteristics, the same abilities are required for
successful leadership in the two situations. The obtained
correlation between the Project X and LGD scores was only .34,
indicating that the two have only a small amount of variance in
common, and a considerable proportion of specific variance which has
been


Table 3 
t-Problem Correlation Patterns for the Three Rating Conditions 


Rho Correlation Coefficients* 


Correlation Pattern 
etl ee a ae 
Project X Versus 6 55 49 61 16 
1. 1-Hour Ratings = z : . } 7 7 

2. 3-D , x 79 51 71 68 88 

m -Day Ratings x 26 71 73 81 

- 6-Month Ratings i 

Leaderless Group Discussion Versus x 63 70 74 
4. 1-Hour Ratings l X 18 8 
5. 3-Day Rati , 91 
Batis x 91 

X 


6. ; 
y 6-Month Ratings 
- Sum of 1 through 6 
a : > i lity characteristi 
asr Obtain : fficients between each problem score and the 12 personality ca 
at ed by rank- ; rrelation coeiicie A 
ed under by rank-ordering the i computing the rho's between these rank orders. 




found in other analyses of the present re- 
search data to be differentially related to 
athletic-physical ability in the case of Project 
X and to general intelligence, intellectual in- 
terests, and verbal ability in the case of the 
LGD.
Significant differences were also found in 
the level of the correlations between the prob- 
lem scores and the personality characteristics 
which could be attributed to the source of the 
personality ratings. The 1-Hour ratings cor- 
related significantly lower than the 3-Day rat- 
ings but not significantly lower than the 6- 
Month ratings. The 3-Day and 6-Month cor- 
relations did not differ significantly. These 
differences appear to be mostly a function of 
differences in the reliabilities of the three 
types of ratings, and would probably disap- 
pear if the correlations were corrected for un- 
reliability of the ratings. 


Other analyses have indicated that the trait
variables do not differ appreciably in reliability within any type of rating condition.
Thus, the obtained significant differences in 
their correlations with the small group leader- 
ship problems are probably a function of their 
relative importance to success in these prob- 
lems. The last column of Table 1 shows the 
average relationships of each trait variable 
across both types of problem and all rating 
conditions. Four traits (Extroversion, As- 
sertiveness, Social Maturity, and Energetic) 
are significantly correlated at or beyond the
1% level, and Effective Intelligence is significantly correlated at the 5% level. These
traits appear to have somewhat of a social
orientation, whereas most of the traits which
are not significantly correlated (Adjustment,
Determination, Orderliness, and the others)
appear to be more personally oriented.

With respect to the “great man” theory,
these data seem to confirm that to some extent, leadership in a small group situation depends upon the characteristics brought to the
situation by the Ss themselves, although the
nature of the problem situation, and, perhaps, the nature of the group, will also help
determine which Ss are seen as the leaders.
Personality requirements for leadership in different types of situational tests are similar
with respect to patterns of personality characteristics, although not in level, and account
for most of the common variance in this research.


Received September 30, 1957.


References 


1. Bass, B. M., McGehee, C. R., Hawkins, W. C., Young, P. C., & Gebel, A. S. Personality variables related to leaderless group discussion behavior. J. abnorm. soc. Psychol., 1953, 48, 120-128.

2. Bell, G. B., & French, R. L. Consistency of individual leadership position in small groups of varying membership. J. abnorm. soc. Psychol., 1950, 45, 764-767.

3. Borgatta, E. F., Bales, R. F., & Couch, A. S. Some findings relevant to the great man theory of leadership. Amer. sociol. Rev., 1954, 19, 755-759.

4. Carter, L. F. Some research on leadership in small groups. In H. Guetzkow (Ed.), Groups, leadership and men. Pittsburgh: Carnegie Press, 1951, 146-157.

5. Carter, L. F. Evaluating the performance of individuals as members of small groups. Personn. Psychol., 1954, 7, 477-484.

6. Cattell, R. B. Confirmation and classification of primary personality factors. Psychometrika, 1947, 12, 197-200.

7. Cattell, R. B. New concepts for measuring leadership in terms of group syntality. Hum. Relat., 1951, 4, 161-184.

8. Durkheim, E. The division of labor in society. Glencoe, Ill.: Free Press, 1947.

9. Moreno, J. L. Who shall survive? Beacon, New York: Beacon Hill, 1953.

10. Redl, F. Group emotion and leadership. Psychiatry, 1942, 5, 573-596.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


Wrapper Influence on the Perception of Freshness in 
Bread 


Robert L. Brown 


Furman University 


From previous consumer research (1) it has been found that the two properties looked for in bread by the majority of people are freshness and flavor. Freshness is determined largely by “feeling” of the loaves of bread and flavor is determined by taste. At the present time, breads are being wrapped in a number of different type wrappers ranging from cellophane to a heavy wax or plastic. There are perhaps both advantages and disadvantages to each type of wrapper. Persons engaged in the marketing of bread have often reported the opinion that some breads sell better in one wrapper than in another. Many hypotheses have been made with reference to this difference, but little has been done in the way of controlled research. This is the second in a series of proposed studies on the influence of the wrapper on the perception of freshness in bread.

In the original study conducted at Purdue University in 1955 (2), it was hypothesized that tactual sensations aroused by the wrapper on bread influence the perception of freshness in bread. More specifically, it was hypothesized that two loaves of equal freshness, but in different type wrappers, would be judged to have differential degrees of freshness. In testing the above stated hypothesis, four different type wrappers were selected. These were: cellophane; Saran; regular wax; and a special wax with a subwrapper. The experiment was performed in a laboratory situation in which 16 male and 16 female students were used as Ss. Sixteen loaves of fresh bread, all baked together during the night before the experiment, were rewrapped in the various wrappers by the experimenters. The Ss were seated parallel with the side of a table with their right arm around behind a screen. The S was instructed that this was an experiment to determine whether or not people can tell how fresh bread is by feeling it; that one loaf and then another would be presented under the S’s hand; that he should feel the one and then the other and tell the experimenter which of the two was fresher. No equal judgments were allowed.

A full paired-comparison design was used 
and the pairs were randomly presented. The 
responses of the Ss were recorded on indi- 
vidual record sheets for the purpose of subse- 
quent analysis. 

From the analysis of the data, no signifi- 
cant differences were found between sex, be- 
tween sequence of presentation, between first 
and second halves of groups feeling the same 
set of loaves; or between the first and second 
halves of the total group during the experi- 
mental day. The difference between the ob- 
served and expected frequencies of judgments 
for the four wrappers gave a chi-square value 
of 26.38 which is significant beyond the 1% 
level with 3 degrees of freedom. The per- 
centage of judgments of “fresher” made by 
these 32 Ss were as follows: Cellophane, 
68%; Saran, 56%; regular wax, 42%; and 
the special wax with subwrapper, 34%. The 
percentages were determined on the basis of
the number of judgments in favor of a particular wrapper over the total number of
times that wrapper appeared in the judgment
pairs. Since a full paired-comparison design
was employed for the four wrappers, the percentages add up to 200%.
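
The arithmetic behind the 200% total can be checked with a short sketch. It assumes only what the text states: every judge compares every pair once, no ties are allowed, and each percentage is "wins divided by appearances." The random choices, the `percent_fresher` helper, and the judge counts are illustrative assumptions, not the study's data. With k wrappers the percentages always sum to 100 × k / 2, whatever the individual judgments are.

```python
import random
from itertools import combinations


def percent_fresher(items, n_judges, seed=0):
    """Simulate forced-choice paired comparisons and return, per item,
    the percentage of its appearances on which it was chosen."""
    rng = random.Random(seed)  # arbitrary choices; only the total matters here
    wins = {item: 0 for item in items}
    appearances = {item: 0 for item in items}
    for _ in range(n_judges):
        for a, b in combinations(items, 2):  # full paired-comparison design
            winner = rng.choice((a, b))      # no "equal" judgments allowed
            wins[winner] += 1
            appearances[a] += 1
            appearances[b] += 1
    return {item: 100.0 * wins[item] / appearances[item] for item in items}


four = percent_fresher(["cellophane", "Saran", "regular wax", "special wax"], 32)
three = percent_fresher(["cellophane", "cellophane + band", "wax"], 50)

print(round(sum(four.values()), 6))   # total for four wrappers
print(round(sum(three.values()), 6))  # total for three wrappers
print(68 + 56 + 42 + 34)              # percentages reported for the first study
```

Because the total is fixed by the design, a percentage above 50% for one wrapper necessarily pulls the others down, which is why percentages from designs with different wrapper sets are not directly comparable.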

The original study left a number of questions unanswered and raised some other questions. The purpose of this second study is to
determine the answers to two of these questions: (a) Will the same differential influence
of wrappers on the perception of freshness in
bread also be found among the primary consumer group, housewives, as with university
students? and (b) Will the same differential
influence of the wrappers on the perception
of freshness in bread hold for one- and two-day-old bread as it does for fresh bread?


Procedure 


In order to answer these questions, three conventional type wrappers were selected: cellophane;
cellophane with a five-inch waxed paper insert band;
and wax. The cellophane wrappers were the commonly used .001-inch thick and weighed approximately one pound per 21,000 square inches. The
waxed paper for the wrapper and for the insert band
was of base paper weighing 25 pounds per ream and
waxed up to 37 pounds per ream. All wrappers
were unprinted.

Eighteen regular, one-pound, round-top, sliced 
loaves of white bread were used. Six of these were 
fresh, having been baked at the same time on the 
afternoon before the experiment; six were one day 
older; and six were two days older. All of the loaves 
were stored in their original wrappers until a few 
hours before the experiment when they were re- 
wrapped in the various wrappers for the experiment. 
The wrappers were adjusted to a degree of tightness 
(or looseness) judged to be comparable. 

The experiment was performed in a laboratory 
type situation set up in the foyer of a large super 
market. Fifty of the housewives coming to the 
market to shop volunteered to serve as Ss. 

The Ss were tested under blinded conditions made 
possible by seating each S parallel with the side of a 
table and close enough to place her right forearm 
and hand behind a screen which was mounted to the 
table. This arrangement made it possible to place
the loaves, one at a time, under the S’s hand and
prevented the S from seeing the loaves being judged.

Each S was instructed that this was an experiment
to determine whether or not people can tell how
fresh bread is by feeling it; that one loaf and then
another would be presented under the S’s hand; that
she should feel one and then the other and tell the
experimenter which of the two was fresher. No
equal judgments were allowed.


Table 1

Judgments of “Fresher” Made for Each of Three Wrappers and Three Ages of Bread When Presented in a Paired-Comparison Design to 50 Housewives

                                        Age of Bread
Wrapper                      Fresh     One-day   Two-day   Total
Cellophane                   (125)     (122)     (126)     (373)
                             62.5%     61.0%     63.0%     62.2%
Cellophane with              (118)     (97)      (94)      (309)
  waxed band insert          59.0%     48.5%     47.0%     51.5%
Wax                          (57)      (81)      (80)      (218)
                             28.5%     40.5%     40.0%     36.3%



Table 2

Chi-Square Values Between Observed and Expected Frequencies of Judgments of “Fresher” Made for Each of Three Wrappers and Three Ages of Bread

Variables                                              df    χ² Values
Between first and second halves of Ss on
  sets of loaves used                                   8    10.
Between sets of loaves used in the first and
  second halves of the experimental day                 8    11.99
Between fresh and one-day-old bread                     2     6.26*
Between fresh and two-day-old bread                     2     6.58*
Between one- and two-day-old bread                      2     0.12
Between observed and expected frequencies of
  judgments for three wrappers for:
    Fresh bread                                         2    27.98**
    One-day-old bread                                   2     8.54*
    Two-day-old bread                                   2    11.12**
    All three ages of bread                             2    40.45**

* Significant at the 5% level.
** Significant at the 1% level.


A full paired comparison design was used and
pairs were randomly presented according to a system previously worked out for each S. The responses of the Ss were recorded on the individual
record sheets for subsequent analysis.


Results 


The numbers and percentages of judgments of “fresher” made for each of the three wrappers when presented in a full paired-comparison design to the 50 housewives under blinded conditions are presented in Table 1. The percentages are determined on the basis of the number of judgments in favor of a particular wrapper over the total number of times that wrapper appeared in the judgment pairs. Since a full paired-comparison design was employed for the three wrappers, the percentages add up to 150%.

A full analysis of the data was made by the chi-square technique. The results of these analyses are to be found in Table 2. No significant differences were found between the frequencies of judgments as “fresher” for the first and second halves of Ss tested on the same set of loaves in the various wrappers. Fearing that some differences might be found between the sets of loaves used in the first half and those used in the second half of the experimental day, a χ² test was made on these data. No significant differences were found. With 8 degrees of freedom and an alpha equal to .05, a single χ² value of 15.5 or more would be taken to indicate a difference that was not due to chance variations in 95 out of 100 cases. It will be noted from Table 2 that the values obtained are clearly below this value for significant differences. For an interpretation of the remainder of the tests reported in Table 2, the following should be noted: at the 95% confidence level a χ² of 5.99 or greater is required, and at the 99% confidence level a χ² of 9.21 or greater is required. These values are for two degrees of freedom.

Discussion 


The purpose of this study was to ascertain answers to two questions. The first question was: Will the same differential influence of wrappers on the perception of freshness in bread be found among the primary consumer group, housewives, as with university students? The plain cellophane and the plain waxed wrappers were identical in the two studies. The percentages of judgments of “fresher” for the cellophane wrappers on fresh bread were 68.0% and 62.5% respectively for students and housewives. The judgments for the plain wax wrappers on fresh bread were 42% for students and 28.5% for housewives. These percentages are not strictly comparable because the percentage for a given wrapper depends upon the other wrappers in the group. All the wrappers were not the same in the two studies. However, it will be noted that the order or ranking is the same for cellophane and wax in both studies. Therefore, for cellophane and wax, it can be concluded that the primary consumer group, housewives, responded in the same way as university students. It may also be concluded with a very high degree of confidence that fresh bread (of equal freshness) feels fresher when it is wrapped in cellophane than when it is wrapped in a wax wrapper.

The second question was: Will the same
differential influence of the wrappers on the
perception of freshness in bread hold for one-
and two-day-old bread as it does for fresh
bread? An examination of the percentages
of “fresher” in Table 1 shows some change in 
the magnitude of the judgments of “fresher” 
for the wax wrapper and for the cellophane 
wrapper with the wax band insert with one- 
and two-day-old bread. Although the orders 
remain the same, there is a significant differ- 
ence between the percentages on fresh bread 
and those for one-day-old bread. Likewise 
there is a significant difference between fresh 
bread judgments and two-day-old bread judg- 
ments for the various wrappers. The chi- 
square values are given in Table 2 and were 
found to be significant at the 95% level of 
confidence. No significant differences were 
found between the one- and two-day-old bread 
in percentages of judgments. All three ages 
of bread showed significant deviations from 
chance expectancies for the three wrappers. 
It may be concluded, therefore, that, like fresh 
bread, one- and two-day-old bread also feels 
fresher when wrapped in a plain cellophane 
wrapper than when wrapped in wax or cello- 
phane with a wax band insert. 


Summary and Conclusions 

The purpose of this experiment was to answer two questions with reference to the influence of tactual sensations supplied by the wrapper on the perception of freshness in bread. Previous research by the author had revealed that for fresh bread, loaves of equal freshness were perceived by university students to be fresher when wrapped in cellophane than when wrapped in wax. In this study the following questions were asked: (a) Will the same differential influence of wrappers on the perception of freshness in bread be found among the primary consumer group, housewives, as with university students? (b) Will the same differential influence of the wrappers on the perception of freshness in bread hold for one- and two-day-old bread as it does for fresh bread?

In order to answer these questions, three conventional type wrappers and three ages of bread in a paired-comparison design were presented to 50 housewives under blinded conditions. The results warrant the following conclusions:

1. Housewives, the primary consumer group, 
responded to the test situation in the same 
way that university students responded. They 
perceived fresh bread of equal freshness to 
be fresher when wrapped in cellophane than 
when wrapped in wax. 

2. The same differential influence of the
wrappers on the perception of freshness in
bread applies to one- and two-day-old bread
as it does to fresh bread. The magnitude of
the judgments was not as great for one- and
two-day-old bread, but the order remained
the same and judgments still differed significantly from expected frequencies.


Received October 7, 1957. 


References 


1. Brown, R. L. A consumer research study of 
bread. Greenville, S. C.: Henderson Advertis- 
ing Agency, 1953. 

2. Brown, R. L., Brune, R. L., Thackray, R., &
Kephart, N. C. Wrapper influence on the perception of freshness in bread, I. Unpublished
manuscript, Furman Univer., 1955.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


Factors Associated with Vocational Interest Profile 
Stability * 


Leslie A. King


The General College, University of Minnesota 


Although numerous investigations have been made of the stability of Strong Vocational Interest Blank (SVIB) scores, relatively little has been done in studying factors that might be predictors of interest permanence. If it is possible to determine which individuals could be expected to have stable vocational interests and which individuals could be expected to have unstable interests, the validity of using the SVIB in educational-vocational counseling would be enhanced and time and money spent in retesting individuals with stable interests would be saved. The purpose of the study was to determine which of 12 types of information available about a college freshman are useful in predicting SVIB profile stability. The SVIB profile stability measure used was one developed by the investigator and called an S score. A description of the development of the S score as well as normative data and a study of relationships between S and other profile stability measures has been previously published (4).


Method 


Subjects. The 242 subjects of this study were all high school graduates who entered the General College of the University of Minnesota in the fall quarter, 1954, as freshmen, took the SVIB during the orientation-registration program prior to the start of classes, and completed their third quarter in the spring quarter, 1955. The subjects ranged in age from [illegible].5 to 27.5 years at the time of the fall administration of the SVIB with a median age of 19.0, a mean age of 20.0, and a standard deviation of 2.6. They ranged in high school percentile ranks (HSR) from 1 to 77 with a mean rank of 27.0 and a standard deviation of 16.4. The SVIB Interest-Maturity (I-M) mean score was 48.8 with a standard deviation of [illegible]. The General Aptitude Test Battery (GATB) G mean score was 107.0 with a standard deviation of 11.3. The American Council on Education Psychological Examination (ACE) mean score (1952 form) was 77.2 with a standard deviation of 18.0. Eighty-eight subjects (36.4%) were veterans and 31 (12.8%) were married.

* This paper is based upon a portion of a Ph.D. thesis submitted to the graduate faculty of the University of Minnesota. The author wishes to acknowledge the guidance of his advisors, Willis E. Dugan and Cyril J. Hoyt.

Procedures. The subjects were retested on the 
SVIB during the latter part of the spring quarter of 
1955. The interval between SVIB administrations 
averaged nine months. All SVIB’s were scored on 
44 occupational scales and the I-M scale. 

The two general types of statistical analysis used 
in studying the relationship between each of the 12 
factors and the S scores were product-moment cor- 
relations and analysis of variance. Correlational 
analysis was used for the following eight factors: 
(a) age, (b) I-M scores, (c) GATB G scores, (d) 
number of Primary (P) patterns, (e) number of Reject (R) patterns, (f) ACE scores, (g) HSR, and
(h) Depth index (developed by Hoyt, Levy, and
Smith [3]).

The analysis of variance technique was used for 
the following four factors: (a) socioeconomic status, 
(b) congruence of stated occupational goal with 
measured interest pattern, (c) veteran status, and 
(d) marital status. The hypothesis of homogeneity 
of variance was tested for each factor before the 
analysis of variance was made. 

The socioeconomic status of each student was determined by utilizing Edwards’ Social-Economic Scale
(2, pp. 176-180). This scale classifies a large num- 
ber of occupations into six categories on the basis 
of the social prestige and economic level of each oc- 
cupation. Each student’s parental occupation was 
classified into one of the six categories and analysis 
of variance was made for the six categories on the 
mean S scores. 

The relationship between a student’s stated occupational goal, measured interest pattern, and interest stability was investigated in the following manner: (a) Each student’s stated occupational goal was
classified into one of the 11 SVIB groups. (b) After 
the occupational choices of the students were classi- 
fied into a specific SVIB group, such as Group I, the 
students in each group were classified into four cate- 
gories according to whether or not they had a Pri- 
mary, Secondary, Tertiary or Reject pattern in that 
group. (c) The S scores for the four categories were 
tested for equality by the analysis of variance. 




Results 


Correlational analysis. The correlation coefficients for only two factors, number of P patterns (r = -.17) and Depth index (r = .24), were significant at the .01 level. Two factors, number of R patterns (r = -[illegible]) and HSR (r = -.11), had coefficients which approached, but did not reach, the .05 significance level. The three factors which appeared to be most promising for predictive purposes, Depth index, number of P patterns, and number of R patterns, were included in a multiple regression equation. The multiple correlation coefficient was .29, which is significant at the .01 level of confidence. Because a multiple R of .29 has little predictive capacity, it is concluded that the eight factors used in correlational analysis do not individually or when optimally combined enable one to predict with a reasonable degree of confidence the interest stability for an individual case.
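
The significance statements above can be checked with the usual t test for a product-moment correlation. The sketch below assumes a two-tailed test with N = 242 (df = 240) and uses approximate critical t values taken from standard tables. Both are assumptions, since the article does not state how significance was evaluated. The last line converts the multiple R into a proportion of variance, which is one way to see why .29 has "little predictive capacity."

```python
import math

N = 242                     # subjects in the study
T_05, T_01 = 1.970, 2.596   # approximate two-tailed critical t for df = 240 (assumed)


def t_for_r(r, n):
    """t statistic for testing a correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def critical_r(t_crit, n):
    """Smallest |r| that reaches the given critical t with n cases."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


for label, r in [("P patterns", -0.17), ("Depth index", 0.24), ("HSR", -0.11)]:
    t = abs(t_for_r(r, N))
    print(label, round(t, 2), t >= T_05, t >= T_01)

print(round(critical_r(T_05, N), 3), round(critical_r(T_01, N), 3))
print(round(0.29 ** 2, 4))  # share of S-score variance tied to the three predictors
```

Under these assumptions a coefficient needs an absolute value of roughly .13 to reach the .05 level and roughly .17 to reach the .01 level, which is consistent with -.11 falling just short and -.17 and .24 being significant.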


Analysis of variance. The analysis of variance for the socioeconomic status, veteran
status, and marital status factors indicated
in each case acceptance of the hypothesis of
equality of mean scores, In analyzing the re- 
lence of stated occu- 
red interest Pattern 
the analysis of variance was 


, IL IV, V, VIII, IX, 


An example of the application 


44 students (18.8%) 


Group VIII, this factor is of very limited pre- 
ic e entire sample is con- 
sidered. 


Discussion 


The negative results for the I-M scale as a
predictor are in agreement with the results
reported by Stordahl (7) and Powers (5) but
contradict the results reported by Strong (8,
p. 281). The results for ACE and HSR
are similar to those found by Stordahl (6).
While Strong (9, p. 91) has found evidence 
to substantiate the expectation that interests 
become more stable with increasing age, the 
results for the investigator’s sample do not 
support the assumption. A nonsignificant re- 
lationship between intelligence and interest 
stability is in contrast with Cisney’s (1) re- 
port for high school students. 

The most promising of the 12 factors studied is the Depth index. Use of the index as a predictor is based on the theory that consistency and integration of interests are necessary conditions for stability and that these conditions can be measured by “expected” or “unexpected” patterns. Hoyt, Levy, and Smith (3) found a correlation of -.33 between interest stability as measured by rank correlation (rho) and the Depth index for a group of high school seniors retested two years later when they were college sophomores, and a correlation of -.37 for a sample of the same group retested four years later when they were college seniors. (The reason for the difference between Hoyt, Levy, and Smith’s and the investigator’s correlation coefficient signs is that a low rho indicates a relatively unstable profile.) The problem of predicting vocational interest profile permanence remains unsolved. The investigator suggests the following problems for future research: (a) the relationship of personality characteristics to interest stability; (b) measurement of individuals’ understanding of SVIB item content; (c) intensive case studies, including interviews, of a sample of cases who exhibit little interest change (such factors as perception of interest changes, occupational information, work and school experiences, hobbies, health, family pressures, and vocational interests of friends could be investigated); (d) longitudinal studies of the origin and development of interests.


Summary 


The value of 12 factors in predicting SVIB profile stability of college freshmen was investigated. No significant relationship was found between stability and age, I-M scores, GATB G scores, number of R patterns, ACE scores, HSR, socioeconomic status, veteran status, and marital status. A significant relationship existed between stability and two factors: number of P patterns and Depth index. Congruence of stated occupational goal in business with a P pattern in SVIB Group VIII was also significantly related to stability. However, the predictive value of the significant factors when taken individually or when optimally combined was of such a low order that they cannot be used with a reasonable degree of confidence in predicting interest stability. Suggestions for further research were also made.


Received October 11, 1957. 


References 


1. Cisney, H. N. The stability of the vocational-interest profiles of high-school students over
a two-year period. Papers of the Michigan
Academy of Science, Arts, and Letters, 1945,
1947, 31, 309-313.


2. Edwards, A. M. Population-Comparative occu- 
pation statistics for the U. S., 1870 to 1940. 
Washington: U. S. Government Printing Of- 
fice, 1943. 

3. Hoyt, D. P., Levy, S., & Smith, J. L. A further 
study in the prediction of interest stability. 
J. counsel. Psychol., 1957, 4, 228-233. 

4. King, L. A. Stability measures of Strong Voca- 
tional Interest Blank profiles. J. appl. Psy- 
chol., 1957, 41, 143-147. 

5. Powers, Mable K. A longitudinal study of voca- 
tional interests during the depression years. 
Unpublished doctoral dissertation, Univer. of 
Minnesota, 1954. 

6. Stordahl, K. E. The stability of Strong voca- 
tional interest blank patterns for pre-college 
males. Unpublished doctoral dissertation, Uni- 
ver. of Minnesota, 1953. 

7. Stordahl, K. E. Permanence of interests and interest maturity. J. appl. Psychol., 1954, 38,
339-340.

8. Strong, E. K., Jr. Vocational interests of men
and women. Stanford Univer. Press, 1943.

9. Strong, E. K., Jr. Permanence of interest scores
over 22 years. J. appl. Psychol., 1951, 35,
89-91.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


The Significance of Time Spent in Answering Personality 
Inventories 


Arthur R. Yeslin 
Illinois Institute of Technology 


Leroy N. Vernon and Willard A. Kerr 


The Personnel Laboratory 


It was believed that the time spent in an- 
swering personality inventories might have 
significance in the measurement of person- 
ality. Since inventories which ask a subject 
to answer questions about himself admittedly 
produce answers which are influenced by the 
motivation of the subject, a time measure- 
ment would be less liable to bias than the 
trait scores ordinarily derived. It was hy- 
pothesized that individuals who are insecure, 
indecisive, or poorly adjusted at the time of 
applying for a sales job would consume more 
time, relatively, on tests composed of volun- 
tary commitment items which are likely to be 
tension producing than on tests composed of 
problem-solving items. It was also hypothe- 
sized that these individuals would be less suc- 
cessful as salesmen than others. 

Since reaction time (time spent in reacting 
to complex choice questions) and reading 
speed are not the variables with which we are 
concerned, it would be desirable to control 
these variables, In the present study there is 
no independent measure of them but they 
were to some degree controlled. 

Opportunity to test this hypothesis by cor- 
relating time expenditures against an accept- 
able independent criterion came in 1957. A 
leading national electronics sales firm whose 
entire sales organization of 226 men had been 
tested supplied success rankings by regional 
and district sales Managers. These data were 
based on actual sales performance corrected 
by the raters for experience, fertility of re- 
gion, and other aspects of sales Opportunity 
known to the managers. The group included 
the present force of 171 sales engineers and 
55 men who had been terminated because of 
failure to sell. The entire group was ranked, 
giving the 55 separated men all a rank of zero. 

Each of these men had completed a battery of tests requiring about 7 hours and including the Strong Vocational Interest Blank, the Bernreuter Personality Inventory, an inventory of neurotic tendencies, a selling aptitude test (actually an inventory containing personality, interest, and values questions), the Bennett Mechanical Comprehension Test, a vocabulary test and The Personnel Laboratory Power Intelligence Test (no time limit). These tests were presented assembled in booklets so that the subject went directly from one to another of the tests. He was instructed to record the time at the beginning and the end of each questionnaire.

Two types of statistics were used in this study: (a) The total time spent on the tests (referred to here as absolute time), and (b) a ratio between times spent on intellective tests and times spent on inventories. The ratio is believed to produce some degree of correction for differences in reading speed and reaction time since both inventories and intellective tests require considerable reading.

Absolute Time Consumption 


When absolute time consumed on each of these tests is correlated with sales success of men in the field, the Flanagan-method Pearsonian coefficients (1) shown in Table 1 are obtained.

As anticipated, these absolute time correlations are of negligible magnitude and suggest that if time consumption data have any predictive significance it has been covered by uncontrolled factors.

Relative Time Consumed 


Accordingly, raw absolute time scores were transmuted into ratios, using the time consumed by each examinee on personality and interest inventories as a numerator, and time on intellective tests as a denominator. Various combinations of tests implementing this hypothesis were investigated, as shown in Table 2, including ratios having “intellective” tests as numerator.

Inspection of Table 2 reveals statistical significance in 19 of the 28 coefficients, and a number of the figures are large enough to give substantial support to the hypothesis.

Table 1

Correlation Between Sales Success and Time Consumed on Each of Seven Power Tests
(N = 221 electronics salesmen)

Test                                         r
Strong Vocational Interest Blank           -.14*
Bernreuter Personality Inventory           -.05
[illegible] Personality Inventory          -.07
General Adjustment Inventory               -.02
Bennett Mechanical Comprehension Test       .03
Power Test of Intelligence                  .00
Vocabulary Test                             .02

* Coefficients marked with an asterisk are statistically significant at the 95% level.



When we investigate various combinations
of time scores, we find support for the hypothesis. The ratio

(Vocabulary Test Time + Power Intelligence Test Time) / (Selling Aptitude Test Time + Vocational Interest Blank Time)

correlated with a criterion of sales success,
r = .42. The largest single coefficient in the
matrix (-.49) is produced by a ratio which
has time for sales interest and sales personality as its numerator and total time for all
tests as its denominator.

It should be noted that time for all seven 
tests represents approximately one standard 
day of testing. Since the reliability of test 
response speed behavior probably increases as 
a function of the Spearman-Brown prophecy 
formula, it is not surprising that increased 
reliability of the (whole day’s speed) de- 
nominator should make for a more reliable 
ratio which can, in turn, correlate higher with 
an independent criterion than can a less re- 
liable (fractional day’s speed) ratio. 
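
The reliability argument above can be illustrated with the Spearman-Brown prophecy formula itself. The sketch below is only an illustration: the starting reliability of .50 and the lengthening factor of 7 (one test's time versus seven tests' time) are hypothetical figures, since the article reports no reliabilities for the time scores.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a measure is lengthened by a factor k.

    r is the reliability of the original (shorter) measure.
    """
    return k * r / (1 + (k - 1) * r)


# Hypothetical figures, not taken from the article: a fractional-day
# speed score with reliability .50, lengthened sevenfold.
fractional_day = 0.50
whole_day = spearman_brown(fractional_day, k=7)
print(round(whole_day, 3))
```

Under these assumed numbers the lengthened measure is considerably more reliable than the short one, which is the direction of effect the paragraph relies on.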


Table 2 


Correlations Between Sales Success and Relative Time (Expressed in Ratios) for Certain Tests
and Combinations of Tests
(N = 221 electronics salesmen)


Ratio Numerator 


Divisor A B (e D E F G 
~y, 
RL, cabulary Test 
Wi 
E Bennetet of Intelligence 08 
E eneral Mechanical Comprehension —.05  —.08 
ro ellin, Adjustment Inventory .18 .08 
a Bema Ptitude Inventory .25 25 i$ 
Sa tong ne Personality Inventory .00 pr r n 
~Totay f cational Interest Blank 37 2 * FD gk 
A+B+C+D+E+F+G 16 16 : : é 
A+B 19 
a= Al 
D+E 
“qo = 
A+B A+B_ 2 
Dee ATE oe 
DTETF +6 = 36 E+G 
for, No 
ara MEE ea 


Tres te 
bea Tela gee 
tabi ations ee 


pere renes in boldface are statistically significan! 
Cer co mae 
is Statistic, by ratios in whic! 


h the denominato! 


r was the total time on 




The best prediction of sales success from a 
single test time ratio was obtained by divid- 
ing the raw time score on the selling aptitude 
test by the total raw time score for all seven 
tests. In keeping with the original hypothe- 
sis, this finding suggests that the individual 
who is indecisive, poorly adjusted or insecure 
with regard to the sales field and who is ap- 
plying for a sales job becomes more emotional 
in responding to these questions and hence 
spends a disproportionate amount of time on 
them. 


Summary 


It was hypothesized that the individual with many misgivings and anxieties about his suitability and/or prospects in the sales field would spend relatively more time answering inventory questions concerned with sales personality and sales interest than on problem-solving questions not directly related to sales. It was assumed that such misgivings and anxieties would militate against his success as a salesman. Two hundred and twenty-six electronics sales engineers were ranked according to sales success. These rankings were correlated with ratios of time spent on inventories divided by time spent on all tests. A number of significant correlations were produced, the highest resulting from inventories measuring sales personality or sales interest as related to total time on all tests.


Received October 29, 1957. 


Reference 


1. Thorndike, R. L. Personnel selection. New York: 
Wiley, 1949. 




Journal of Applied Psychology
Vol. 42


Self-Ratings and the EPPS* 


John H. Mann 


New York University 


The present study is concerned with the relationship between the 15 variables which the EPPS purports to measure and a series of self-ratings on these same variables.


Method 


The Ss of the study were 96 graduate students selected at random from approximately 160 students who were attending a course in education. Thirty-three of the Ss were males; 63 were females. The median age of the Ss was 29; the age range extended from 19 to 54 years.

The following three instruments were administered to the Ss at the beginning and at the end of the course which they were attending: (a) the EPPS; (b) a self-rating scale of 15 items based on the variables which the EPPS purports to measure; and (c) an ideal self-rating scale based on the same 15 variables. The rating categories used for these rating scales were derived from the descriptions of the variables measured by the EPPS as given in the EPPS Manual (1); they are reproduced in Table 1. On the self-rating scale the Ss were asked to rate themselves as they actually are; on the ideal self-rating scale they were asked to rate themselves as they wished they were. The Ss were encouraged to answer the questionnaires as honestly as possible. They were assured that all test responses would be held in strict confidence and would be used for scientific purposes only.

* The author [...] Levin for making available [...] in this paper.


Table 1

The Relationship of EPPS Variables to
Self-Rating Categories

EPPS Variables      Self-Rating Categories

Achievement         Ability to get things done
Deference           Interest in the opinions of others
Exhibitionism       Standing out in the group
Order               Being neat and orderly
Autonomy            Being independent of the opinions of others
Affiliation         Loyalty to others
Introception        Interest in the feelings of others
Succorance          Dependence on others
Dominance           Dominance in social relations
Abasement           Being timid and feeling inferior
Nurturance          Helping others in trouble
Change              Interest in having novel experiences
Endurance           Completing tasks that are undertaken
Heterosexuality     Interest in the opposite sex
Aggression          Being aggressive toward others


Results


Table 2

Test-Retest Reliabilities of EPPS Scores, Self-Ratings
and Ideal Self-Ratings

                    EPPS       EPPS
                    Edwards'   Present    Self-      Ideal
Variable            Data       Data       Rating     Self-Rating

Achievement         .74        .64        .33        [illeg.]
Deference           .78        .87        .35        .48
Order               .87        .17        [illeg.]   .47
Exhibitionism       .74        .71        .66        .56
Autonomy            .83        .76        .49        .34
Affiliation         .77        .35        .62        .35
Introception        .86        .67        .50        .46
Succorance          .78        .72        .44        .31
Dominance           .87        [illeg.]   .56        .38
Abasement           .88        .69        .44        .26
Nurturance          .79        .59        .57        .33
Change              .83        .86        .59        .69
Endurance           .86        [illeg.]   [illeg.]   .52
Heterosexuality     .85        [illeg.]   [illeg.]   .50
Aggression          .78        .80        .41        .54


Table 2 presents the reliability coefficients obtained from the present data as well as those given in the EPPS Manual (1). The test-retest reliability coefficients of the self-ratings and of the ideal self-ratings are also supplied. The reliability coefficients given by Edwards for the EPPS are somewhat higher than those found in the present study. This discrepancy may be due to the difference in the interval between test and retest for the two sets of data. Edwards reports an interval of one week between test and retest; the present study was based on a three-week interval. It should be noted, however, that
Klett (2) found in an independent study that the split-half reliability coefficients of the EPPS were also somewhat lower than the coefficients reported by Edwards (1). Even these coefficients, however, are reasonably satisfactory for a personality test.


Table 3

Correlation Coefficients Between EPPS Variables and
Self- and Ideal Self-Ratings

                    EPPS and       EPPS and Ideal
Variable            Self-Rating    Self-Rating

Achievement         [illeg.]       .19
Deference           .28*           .04
Order               [illeg.]       .02
Exhibitionism       .06            -.02
Autonomy            [illeg.]       .18
Affiliation         .00            .00
Introception        .07            .02
Succorance          [illeg.]       .03
Dominance           .26*           .04
Abasement           .39*           .10
Nurturance          .34*           .26*
Change              .42*           .12
Endurance           .41*           .10
Heterosexuality     .40*           .12
Aggression          .24*           .11

* Significant at the .05 level.


Table 3 presents the correlations between EPPS variables and self-ratings. Of the 15 correlations, 14 were positive in sign and one was found to be .00. To find this many positive relationships is highly significant since it would occur by chance less than once in a thousand times.
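
The "less than once in a thousand" figure can be checked with a simple sign test. The sketch below assumes, purely for illustration, that the 15 correlations are independent and that a positive or negative sign is equally likely under chance; the article does not state which test was used, so this is one plausible reading rather than the author's computation.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Reading 1 (assumption): 14 or more positive signs among all 15 coefficients.
p_all_15 = prob_at_least(14, 15)

# Reading 2 (assumption): the .00 coefficient is dropped, leaving
# 14 positive signs among 14 nonzero coefficients.
p_nonzero_14 = prob_at_least(14, 14)

print(p_all_15 < 0.001, p_nonzero_14 < 0.001)
```

Under either reading the chance probability falls below .001, in line with the statement in the text.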

Table 3 also indicates that only one of the 15 correlations between EPPS variables and ideal self-ratings was found to be significant. Since one significant correlation coefficient in 20 would be expected to occur by chance, the relationship between EPPS scores and ideal self-ratings appears to be negligible. This finding is not surprising since one component which enters into ideal self-ratings is social desirability, and the EPPS is designed to eliminate the component of social desirability as a factor in test response.


Discussion and Summary 


The findings of the present study support the conclusions that: (a) the EPPS has a satisfactory test-retest reliability; (b) the EPPS correlates with self-ratings on the variables which it purports to measure; (c) the EPPS does not correlate with ideal self-ratings on the variables which it purports to measure.

In order to interpret these findings it is important to note that the categories on which the Ss rated themselves were arbitrarily selected and formulated from the descriptions of the variables supplied by Edwards in the EPPS Manual. This formulation was necessarily a crude procedure since Edwards supplies a whole paragraph of description for each variable, and this paragraph cannot easily be summarized in a few words to form the rating category. With this difficulty in mind, and considering the range of personal interpretation that is likely to occur among different individuals rating themselves on these highly abbreviated category headings, it is indeed surprising that the present results were as decisive as they proved to be.


Received October 25, 1957. 


References 


1. Edwards, A. L. Edwards Personal Preference Schedule manual. New York: Psychological Corp., 1954.
2. Klett, C. J. Performance of high school students on the Edwards Personal Preference Schedule. J. consult. Psychol., 1957, 21, 68-72.


Journal of Applied Psychology
Vol. 42


A Manifest Structure Analysis of the Otis S-A Test of 
Mental Ability, Higher Examination: Form B 


Frank M. du Mas

Montana State University

and King MacBride

Michigan State University


The purpose of this paper is to show how a new method called manifest structure analysis can be used to shorten a standardized test.

Tests constructed by conventional methods are often considered to be well-developed measuring instruments. They are usually composed of many items with a fairly long time limit. There arises in practical situations a need for tests which are sufficiently accurate for certain requirements and at the same time not too demanding of a testee's time. These practical considerations constantly arise in the selection and evaluation of personnel in business and industry. One form of the Otis S-A Test of Mental Ability has been shortened by Wonderlic (3) in order to meet the practical needs of business and industrial organizations. Wonderlic built a 50-item test out of the longer Otis Test, which requires 12 instead of 30 minutes of testing time. His approach was to apply classical test methods to further reduce a test already constructed by means of classical test methods. This shortened form of the fully developed test is widely used in business and industry. Such shortened tests satisfy a real need in many practical situations.

The new method, manifest structure analysis (1), was not developed specifically for the purpose of shortening a test. It is a quite general test theory and method. The shortening of a test is simply one of the many useful applications of the method. Since manifest structure analysis is essentially a scale theory approach to item analysis and other measurement problems, and since classical test construction methods are not well-formulated scaling methods, it was thought desirable to apply the more severe criteria of a scaling procedure to a reputable psychological test. This was done in order to see whether or not a scale could be extracted from a set of items already analyzed by classical test methods.


Procedure 


Criterion 

The criterion selected for prediction was 
the total score made on the Otis S-A Test of 
Mental Ability, Higher Examination: Form B 
(2). The total score was derived under the 
condition of the 30-minute limit rather than 


the 20-minute time limit. 


Method 


The method used for the analysis of responses to items on the full scale was a new method called manifest structure analysis (1). The complete theory and method, with step-by-step computational examples, are given in the reference. The general model of manifest structure analysis is shown in Fig. 1. The particular scale model that we attempted to fit empirically was the intensive model of manifest structure analysis as shown, for the special case of a 10-item test, in Fig. 2. The intensive model is roughly equivalent to a cumulative scale and is simply a special case of the general model shown in Fig. 1.

Fig. 1. This is the general model of manifest structure analysis. Note that columns represent response patterns, not items. A number in the matrix is the probability that a response pattern will occur at a specified level, rank or score.

Fig. 2. This is a special case of the intensive model with 10 possible values of a criterion and 10 items. In this study, rows represent total scores on the long form of the Otis and columns represent scalable items. An x in this matrix represents the probability value, 1, that an individual who makes a specified score will answer certain items correctly; cells without an x represent a probability of zero.

The category scale value of each item is calculated from the formula

$$V = \frac{\Sigma T}{N}, \qquad [1]$$

where

V = a category scale value,
ΣT = the sum of the total scores made on the long form of the Otis S-A Test by every individual answering that item correctly,
N = the number of individuals answering the item correctly.

On the basis of the responses made by the standardization sample, a group of items was selected which was an empirical analogue of the intensive model shown in Fig. 2. A score on the short form was calculated by means of

$$S = \frac{\Sigma V}{n}, \qquad [2]$$

where

ΣV = the sum of the category scale values of those items that an individual answers correctly,
n = the number of items he answered correctly,
S = the individual's score on the shortened test.

In order to predict a score on the longer test from a score on the shorter test, a correction must be made. This correction is

$$S' = mS + k, \qquad [3]$$

where

S' = the weighted score predicted from the score made on the shortened test,
m = the slope constant,
S = the score on the shortened test,
k = the ordinate intercept.

In this way the prediction of the most probable score an individual would make on the longer test is possible from the score made on the shorter test.
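
The three scoring steps can be sketched in a few lines. Everything below is a toy illustration under stated assumptions: the examinees, item names and totals are hypothetical, the formula for S is taken from the definitions that accompany Formula [2], and the slope and intercept are fitted by ordinary least squares, which is how the article describes estimating the parameters of Formula [3].

```python
from statistics import mean


def scale_value(totals_of_passers):
    """Formula [1]: V = sum of long-form totals / number answering correctly."""
    return sum(totals_of_passers) / len(totals_of_passers)


def short_form_score(scale_values, correct_items):
    """Formula [2]: S = sum of scale values / number of items answered correctly.

    Undefined (division by zero) for an examinee with no correct items.
    """
    return sum(scale_values[i] for i in correct_items) / len(correct_items)


def fit_line(xs, ys):
    """Least-squares slope m and intercept k for Formula [3], S' = m*S + k."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar


# Hypothetical data: long-form totals and the items each examinee passed.
totals = {"a": 40, "b": 50, "c": 60, "d": 70}
passed = {
    "a": {"easy"},
    "b": {"easy", "mid"},
    "c": {"easy", "mid"},
    "d": {"easy", "mid", "hard"},
}

items = ["easy", "mid", "hard"]
V = {item: scale_value([totals[p] for p in totals if item in passed[p]]) for item in items}
S = {p: short_form_score(V, passed[p]) for p in totals}
m, k = fit_line([S[p] for p in totals], [totals[p] for p in totals])
predicted = {p: m * S[p] + k for p in totals}
```

Because each scale value is an average of long-form totals, S stays within a narrow band of that scale, and the fitted line stretches it back over the range of the long form.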

Sample 1. This was the standardization sample composed of 70 college students. Their responses to the 75 items of the Otis yielded 5,250 possible single item responses and 70 possible response patterns. In any time limit test composed of items arranged in order of increasing difficulty, the operating characteristic of an item reflects both item difficulty and item inaccessibility because some of the slower Ss never get to try the more difficult items. Both the short and the long forms of the Otis contain and confound these two sources of variance since they were derived under the condition of a time limit.

The intensive model was used as our search model for any possible ordered structure existing among the response patterns. The total time required in extracting the 20 scalable items was 18 minutes. Scale values were calculated for each item. The scores, S and S', were calculated for each individual. The testing time required for the short form was approximately 20/75 of 30 minutes, that is, eight minutes.

Sample 2. This was the sample used in the cross-validation of the short form. This sample was composed of 39 additional students who took both the long form and the short form. The scale values for items had already been derived from Sample 1. These scale values were used to weight the responses to the short form made by individuals in Sample 2. The scores, S and S', were then calculated.


Results

Standardization (Sample 1). Twenty items were selected from the 75 items in the long form. These items, which formed an empirical analogue of the intensive model, are listed in the first column of Table 1. The scale value of each item is shown in the second column of Table 1. Each individual's score, S, was calculated by means of Formula [2]. Scores on the long form and scores on the short form were plotted and a linear function fitted to the data by the method of least squares. The parameters of Equation [3] were estimated to be

$$S' = 10.068\,S - 513.31 \qquad [4]$$

The predicted score, S', was then calculated for each individual by means of Formula [4]. The correlation between total score on the long form and the predicted score on the short form was $r_{TS'}$ = .72. The t test of the null hypothesis resulted in p < .01.


Table 1

The Twenty-Item Test Obtained by Means of a
Manifest Structure Analysis of the Otis

Items    Scale Values

 1       54.43
47       55.52
55       55.75
 4       55.80
60       55.80
18       56.00
30       56.20
43       56.47
53       56.59
67       56.80
63       57.02
65       57.56
62       57.64
45       58.31
69       58.32
58       58.46
68       58.63
73       60.00
74       60.08
71       60.26

Note. The first column contains the numbers which designate items in the long form of the Otis. The second column contains the scale value for each item in the first column.


Cross-validation (Sample 2). The scores made on the long form and on the short form were obtained for each individual in this sample also. Formula [4] was used to calculate the predicted scores, S'. The correlation between total score on the long form and the predicted score on the short form was $r_{TS'}$ = .82. The t test of the null hypothesis resulted in p < .01.


Discussion 


Some clarification might be desirable re- 
garding certain operations performed in a 
manifest structure analysis of the kind that 
occurred in this study. 

The question may have arisen as to just 
why the weighted score, S’, is used when the 
unweighted score, S, is already available. In 
the operations of manifest structure analysis 
the unweighted score, S, has a much smaller 
range than the manifest or criterion variable. 
The unweighted score is, however, linearly related to the manifest variable. By weighting the unweighted scores, as indicated in Formula [3], the prediction of the manifest variable is attained. The weighted scores, S', then have a range closely approximating that of the manifest variable. In this instance, this permits us to obtain from the short form of the Otis the score an individual most probably would have made on the long form of the Otis. The correlation between the long form and the weighted scores obtained on the short form, $r_{TS'}$, was calculated in order to obtain an estimate of the concurrent validity of the shortened test. Except for rounding errors, the correlation $r_{TS'}$ should be identical to the correlation between the total and the unweighted scores, $r_{TS}$.

The shortening of the long form of a test 
by means of manifest structure analysis has 
certain practical advantages over the statisti- 
cal methods employed in conventional test 
analysis. The use of a mechanical device 
called the scaling frame (1) greatly reduces 
the time and cost of the item analysis. For 
example, the scalable items shown in Table 1 
were analyzed and extracted in 18 minutes. 

The intensive catescale is similar to the structural response model of Loevinger's homogeneous tests. In both of these methods of analysis one attempts to relate the difficulty of an item to a particular level of ability. In this regard, they are somewhat similar to the item ogive of classical test methods relating the proportion passing the item to degree of ability. The intensive model is, however, different from both Loevinger's approach and the conventional one. In these two methods the item difficulty is computed from the proportion passing or failing the item, and the trait continuum is inferred from the content of the item. In the intensive
catescale of manifest structure analysis, the 
scale value of an item, which is roughly 
equivalent to item difficulty, is calculated by 
reference to the manifest variable or cri- 
terion. That is, the scale value and the trait 
dimension are both Operationally defined in 
terms of the relationship between an item 
and a manifest criterion, 

Other things being equal, any method that derives actual scale value for items should be superior to the conventional method of scoring, as used in the long form of the Otis, which simply sums the number of items got right. This is due to the fact that when one gives one point for each item got right the implicit and usually fallacious assumption is made that the trait distances between consecutive items are all equal. When a score is calculated on the basis of scale values, as in manifest structure analysis, the trait distance between consecutive items is specified with greater precision. It is obvious that from the long form of the Otis, which contains 75 items, one should be able to select far fewer items spaced over the entire range of difficulty in such a way as to accomplish measurement of a trait. The correlation, $r_{TS'}$, is a form of concurrent validity. The two coefficients obtained for the standardization and cross-validation samples were .72 and .82 respectively.

In the present study, Item 1 was found to be the least difficult and Item 71 the most difficult of all items selected from the long form of the Otis. The difference between these items' scale values is 5.83 in units derived from the total score on the long form of the Otis. This apparent restriction in range is not, however, as important as it may at first seem to be. All calculations were carried to five and rounded to four significant digits. By multiplying the value 5.83 by a constant, say 100, one gets 583. There is actually a range of 583 countable units available for prediction by means of the 20-item short form of the Otis. Equation [4] is essentially that linear transformation which transforms the range of values obtained on the short form into the range of values obtained on the long form.
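
The range argument can be checked directly from the scale values in Table 1 and the constants of Equation [4]. The sketch below is illustrative only: the helper names are hypothetical, and the short-form score is taken to be the mean scale value of the items answered correctly, as the definitions under Formula [2] indicate.

```python
# Scale values from Table 1, keyed by long-form item number.
SCALE_VALUES = {
    1: 54.43, 47: 55.52, 55: 55.75, 4: 55.80, 60: 55.80,
    18: 56.00, 30: 56.20, 43: 56.47, 53: 56.59, 67: 56.80,
    63: 57.02, 65: 57.56, 62: 57.64, 45: 58.31, 69: 58.32,
    58: 58.46, 68: 58.63, 73: 60.00, 74: 60.08, 71: 60.26,
}


def short_form_score(correct_items):
    """Formula [2]: mean scale value of the items answered correctly."""
    return sum(SCALE_VALUES[i] for i in correct_items) / len(correct_items)


def predicted_long_form_score(s):
    """Equation [4]: S' = 10.068 S - 513.31."""
    return 10.068 * s - 513.31


# Spread between the easiest (Item 1) and hardest (Item 71) scale values.
value_range = max(SCALE_VALUES.values()) - min(SCALE_VALUES.values())

# Predicted long-form scores at the two extremes of the short-form scale.
lowest = predicted_long_form_score(min(SCALE_VALUES.values()))
highest = predicted_long_form_score(max(SCALE_VALUES.values()))

# An examinee who answers all 20 short-form items correctly.
all_correct = short_form_score(list(SCALE_VALUES))
```

The slope of about 10 is what stretches the 5.83-unit spread of scale values into a spread of roughly 59 points on the long-form scale.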


Conclusions 


1. A new method, Manifest Structure Analysis, was found applicable to the problem of shortening a test previously well-analyzed and constructed by means of classical test methods.

2. A short test was developed composed of 20 items of the 75-item Otis S-A Test of Mental Ability, Higher Examination: Form B. The testing time for the long form was 30 minutes. The testing time for the short form was eight minutes.

3. The correlation between scores on the long and short forms was .72 for the standardization sample. Cross-validation resulted in a correlation of .82 between the scores obtained on the long and short forms.


Received November 25, 1957. 


References 


1. du Mas, F. M. Manifest structure analysis. Missoula: Montana State Univer. Press.
2. Otis, A. S. Otis Self-Administering Tests of Mental Ability, Higher Examination: Form B. New York: World Book, 1922.
3. Wonderlic, E. F., & Hovland, C. I. The personnel test: a restandardized abridgement of the Otis S-A Test for business and industrial use. J. appl. Psychol., 1939, 23, 685-702.




Journal of Applied Psychology
Vol. 42


Effects of Surface Friction on Skilled Performance with
Bare and Gloved Hands ¹


Hilde Groth and John Lyman 


University of California, Los Angeles 


It is common knowledge that circumstances of environment such as temperature or harmful chemicals may require that protective hand coverings be worn while skilled manual tasks are performed. Although the available literature in this field is not large, there seems to be general agreement that all gloves, even the thinnest surgeon's gloves, lead to some measurable performance decrement (1, 2, 3, [...]). In previous studies, this decrement has been defined as losses in performance speed and quality. One of the important factors contributing to dexterity decrement is assumed to be interference of the hand covering material with normal transmission of cutaneous cues (6, 8).

Pilot studies in this laboratory suggested that changes in surface friction might be of considerable importance as a physical factor in observed skill decrements, perhaps independent of effects on cutaneous sensitivity. In a different context we found that a measurement of prehension force during manipulation could be considered to be a fairly good index of effort during manual performance at low levels of energy expenditure (4). Since friction and required prehension force are closely related on purely physical grounds it seemed appropriate to attempt to evaluate this relationship in terms of various performance criteria.

This investigation was designed to assess experimentally the effects of surface friction on effort, speed of performance and rate of output for a simple manipulation task. It was hoped that simultaneous measurement of these dependent variables would provide us with information as to which aspects of a task would be most adversely affected by non-optimal conditions of surface friction.

¹ This investigation was supported by QM Contract DA 44-109-QM-1531 between the U. S. Army Quartermaster Corps and the University of California. The opinions expressed are those of the authors and do not necessarily reflect those of the contracting agency.

The specific hypotheses tested in this study 
were as follows: (a) Prehension force exerted 
during a task is inversely related to the size 
of the coefficient of friction between the sur- 
face of the handcovering and the manipulated 
object. (b) Speed of performance remains 
stable over some range of friction and is in- 
versely related to extremely low coefficients 
of friction. (c) Rate of output is less sensi- 
tive to changes in friction but also shows an 


inverse relationship. 


Procedure 


Subjects. Twelve right-handed male engineering students were recruited from undergraduate classes.

Apparatus and tasks. Figure 1 shows the manipulation apparatus used. The electronically-controlled manipulation apparatus has been described in detail elsewhere (4). Its main components were a simple formboard, a light bank display panel and a split aluminum cylinder instrumented with strain gages. Both display and control panels consisted of a 3 X 6 matrix arrangement placed on a table 30 in. high. Weight of the aluminum cylinder was 122 gms. Task performance required only the simple motion elements of grasp, transport and release. Instrumentation permitted recording of (a) the integral of prehension force applied to the cylinder during manipulation, (b) the sum of the transport times for each individual movement for the duration of the task, (c) the sum of the cylinder transports. Earphones were worn by all Ss in an attempt to control the aural environment. The output of a random noise generator was adjusted to 65 db SPL and used as a masking noise.

Fig. 1. Manipulation apparatus.

The task was self-paced. The cylinder had to be 
placed into that hole of the formboard which corre- 
sponded to the lighted circle on the display matrix. 
The display would change to the next position as 
soon as the contact of the preceding move was made.
The order of the lighting sequence appeared random 
to the Ss. 

Changes in surface friction were produced by ap- 
plication of a coat of wax-benzene paste or of a 
silicone release agent (Dow Corning 7 Compound) 
to the finger tips of the bare hand and to the finger 
tips of an army leather glove (Glove, Shell, Leather, 
M-1949). By this method we hoped to obtain similar frictional conditions for bare hand and glove performance.




The coefficient of friction between each treatment condition and aluminum was determined by the drag method in which the mean force at the "just slip" point over the horizontal surface is measured by spring scales. The μ was calculated according to the formula,

$$\mu = \frac{\text{mean ``just slip'' force}}{\text{weight of object}}$$
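
As a small worked illustration of the drag-method formula, the sketch below averages several spring-scale readings and divides by the weight of the dragged object. The readings are hypothetical, and using the 122-gm. cylinder weight as the object weight is an assumption made only for the example; the article does not say which object was dragged.

```python
from statistics import mean


def coefficient_of_friction(just_slip_forces, weight):
    """mu = mean "just slip" force / weight of object (same units for both)."""
    return mean(just_slip_forces) / weight


# Hypothetical spring-scale readings in gms.; object weight assumed to be 122 gms.
readings = [80.0, 85.0, 84.0]
mu = coefficient_of_friction(readings, weight=122.0)
print(round(mu, 2))
```

Averaging several pulls before dividing mirrors the "mean force" in the formula and damps the reading-to-reading scatter of a spring scale.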


Routine. The independent variables were combined into six treatment conditions, each of three minutes duration:

1. Bare hand, wiped with alcohol (μ = 1.53)
2. Bare hand, coated with wax-benzene paste (μ = .68)
3. Bare hand, coated with silicone grease (μ = .14)
4. Leather glove, untreated (μ = .41)
5. Leather glove coated with wax-benzene paste (μ = .65)
6. Leather glove coated with silicone grease (μ = .14)

Each S was thoroughly familiarized with the apparatus by E, then given a three-minute practice trial with bare, untreated hands. Treatments 1 to 6 were randomly administered in a subject-by-treatment design during a single session. The Ss were standing for the experimental conditions. During the 5-min. rest period between any two treatments, the Ss sat down and friction coatings were applied by E.

The experiment was conducted during the first [...] 1957. Room temperature remained between [...] and 27° C throughout the experiment.

Fig. 2. Relationship between coefficient of friction and prehension force. (Ordinate: mean prehension force, gms.; abscissa: coefficient of friction.)

Calculations. The effects of surface friction on prehension force were assessed by analysis of variance by ranks (10). Heterogeneity of variance indicated the use of nonparametric statistics. Differences between any two treatments were evaluated by a rank test (10).

The effects of surface friction on time per transport and number of transports were assessed by analysis of variance (7). Differences between any two treatments were evaluated by t tests.

The significance level was set at P < .01.


Results ²


Relationship of friction to effort. Criterion measure: mean prehension force (PF), obtained by dividing the integral of force by total transport time.

PF shows a monotonic increase with a decrease in the coefficient of friction. Figure 2 is a graphical representation of this relationship. Statistical analyses led to the following results: For the bare hand conditions, there was a significant increase in PF between any two treatments showing a decrease in friction. The same results were obtained for the glove conditions.

Comparison of treatments with similar friction, i.e., glove coated with wax with bare hand coated with wax, and glove with silicone grease with bare hand with silicone grease, did not show any significant difference between them.


Table 1

Mean Prehension Forces and Variabilities

Exp.                  PF            PF           Coefficient
Condition             X̄ (gms.)      s (gms.)     of Friction μ

Bare, untreated        486.69        294.2         .73
Bare, alcohol          383.23        196.2        1.53
Bare and wax           515.56        236.2         .68
Bare and silicone     2219.33       1159.3         .14
Leather glove         1154.50        568.0         .41
Glove and wax          575.93        337.9         .65
Glove and silicone    1917.03        878.1         .14


No consistent relationship between performance speed and the coefficient of friction was found. The mean values and variabilities are summarized in Table 2.

Comparison of treatments with similar surface friction led to equivocal results. Performance with the bare hand coated with wax was faster than with the corresponding glove condition. However, for the silicone coatings the trend was reversed and performance with the coated glove was faster.


Table 2

Mean Times Per Transport and Variabilities

Exp.                  T/tr          T/tr
Condition             X̄ (sec.)      s (sec.)

Bare, untreated       .85           .20
Bare, alcohol         .72           .13
Bare and wax          .70           .12
Bare and silicone     .78           .18
Leather glove         .72           [illeg.]
Glove and wax         .73           .13
Glove and silicone    .75           .12


Relationship of friction to rate of output. Criterion measure: total number of transports during the three-minute test trials.

The statistical analysis indicated that only extremely low surface friction as obtained by silicone grease application depressed the output rate consistently. The mean values and the variability of the output rate are summarized in Table 3.

Comparison of treatments with similar surface friction indicated a superior performance for the bare hand coated with wax but failed to show a significant difference between the two silicone conditions.


Table 3

Mean Number of Transports and Variabilities

Exp.                  No. of tr.    No. of tr.
Condition             X̄             s

Bare, untreated       122.2         33.6
Bare, alcohol         146.3         25.0
Bare and wax          150.2         26.5
Bare and silicone     123.5         36.5
Leather glove         139.9         31.8
Glove and wax         140.2         27.3
Glove and silicone    130.1         23.8


² The statistical tables have been deposited with the American Documentation Institute. Order Document No. 5594 from A.D.I. Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington 25, D. C., remitting in advance $1.25 for microfilm or $1.25 for photocopies. Make checks payable to Chief, Photoduplication Service, Library of Congress.


Discussion 


Relating the results of this study to our 
hypotheses, we find that only the first hy- 
pothesis postulating the relationship between 
surface friction and prehension force has been 
supported. Furthermore, changes in prehen- 
sion force were apparently unaffected by any 
effects the handcovering had on cutaneous 
sensitivity. 
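
As a rough illustration of the relationship supported here, the condition means in Table 1 can be rank-correlated. This is only a sketch, not the authors' analysis: the seven pairs are transcribed from Table 1 as it reads in this copy, the leather-glove coefficient (.41) is an assumed reading of a poorly printed cell, and the helper names are hypothetical.

```python
from math import sqrt

# (coefficient of friction, mean prehension force in gms.) per condition,
# transcribed from Table 1; the leather-glove coefficient is an assumed reading.
TABLE_1 = {
    "Bare, untreated": (0.73, 486.69),
    "Bare, alcohol": (1.53, 383.23),
    "Bare and wax": (0.68, 515.56),
    "Bare and silicone": (0.14, 2219.33),
    "Leather glove": (0.41, 1154.50),
    "Glove and wax": (0.65, 575.93),
    "Glove and silicone": (0.14, 1917.03),
}


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


friction = [pair[0] for pair in TABLE_1.values()]
force = [pair[1] for pair in TABLE_1.values()]
rho = spearman(friction, force)
print(f"rank correlation between friction and mean PF: {rho:.2f}")
```

A strongly negative coefficient is what the first hypothesis predicts: the lower the friction, the higher the mean prehension force. With only seven condition means it describes the pattern and is not a significance test.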

If we assume PF to be a useful index of 
effort and that there is a close relation be- 
tween the amount of effort exerted on a task 
and the time of onset of fatigue, the experi-
mental results have additional significance.
The present results indicate that speed and 
output rate on a manipulation task may be 
kept within narrow tolerance limits despite 
adverse conditions of surface friction, though 
the “physiological cost” requirements can be 
considered to have risen considerably. We 
would like to emphasize that our task did not 
require a high degree of manual dexterity and 
this may partly account for the failure to find 
consistent changes in performance speed as 
well as the lack of performance decrement 
attributable to distortions of sensory cues.
The importance of surface friction on a task 
of long duration and its effects upon learning 
and fatigue on the three criterion measures 
cannot be answered from this study. How- 
ever, we feel that the results of this investiga- 
tion have rather unequivocally pointed out 
the importance of surface friction as a physi- 
cal variable for some aspects of manipulatory 
performance. That this variable should re-
ceive adequate attention in designing protec-
tive handcoverings for optimal performance
seems strongly indicated.


Summary 


The major purpose of this study was to
assess the effects of surface friction upon
three criterion measures of manipulatory
performance: (a) prehension force, (b) time
per transport, (c) total number of transports.
These measurements were considered as in-
dices of the following aspects of performance:
(a) effort, (b) speed, (c) output rate.
We tried to separate, at least partially, the ef-
fects of friction from other factors related to
handcoverings and from the problem of lack of
cutaneous sensitivity. Changes in friction
were produced by application of either a layer
of wax-benzene paste or of silicone grease to
the bare finger tips and to the tips of a
leather glove.

It was hypothesized that a decrease in fric-
tion would increase the amount of effort, re-
tard the speed of performance and decrease
the output rate.

Twelve Ss performed a simple manipula-
tion task which required discrete movements
of an instrumented aluminum cylinder on a
formboard.

The results indicated a close relationship
between decrease of surface friction and in-
crease of prehension force. The influence of
friction on time per transport remained ob-
scure and the total number of transports de-
creased only at extremely low values of the
coefficient of friction.


Received December 9, 1957. 


References


1. Aiken, E. G. Combined environmental stresses
and manual dexterity. Army Med. Res. Lab.
Rep. No. 225, 1956.

2. Bradley, J. V. Effect of gloves on control op-
eration time. WADC Tech. Rep. ...

3. Debons, A. Gloves as a factor in reduced dex-
terity. Arctic Aeromed. Lab. Project No. 21-
01-018, 1950.

4. Groth, Hilde. An experimental ...
prehension force as a measure of ...
psychomotor skills. Unpublished doctoral dis-
sertation, Univer. of California, Los Angeles,
1957.

5. Lindquist, E. F. Design and analysis of experi-
ments in psychology and education. Hough-
ton-Mifflin, Boston, 1953.

6. Lyman, John. Final report on studies of some
variables relating handcovering design to
manual performance in extreme environments.
Univer. of California, Los Angeles, Dep. Engi-
neering, Rep. No. 56-7, 1956.

7. McNemar, Q. Psychological statistics.
Wiley, New York, 1949.

8. Teichner, W. H., & Zigler, M. J. A method of
studying the tactual-kinesthetic sensitivity of
the hand. QM. Res. and Developm. Center,
Env. Protect. Div., Rep. No. 224, 1953.

9. Teichner, W. H., Kobrick, J. L., & Dusek, E. R.
Studies of manual dexterity: I. Methodologi-
cal studies. QM. Res. and Developm. Center,
Env. Protect. Div., Tech. Rep. No. EP-3, 1954.

10. Walker, Helen M., & Lev, J. Statistical inference.
Holt, New York, 1953.


Journal of Applied Psychology
Vol. 42, No. 4, 1958


The Identification of Job Activities Associated with Age 
Differences in the Engineering Industry 


S. Griew and W. A. Tucker 1
University of Bristol, England 


A previous study has demonstrated the 
high degree of stability which exists in differ- 
ences of age distribution between jobs in the 
engineering industry, both over a period of 
time and over a range of firms and areas (7, 
8). The results of this investigation suggest 
that a substantial proportion of men as they 
reach middle age must leave those jobs which 
are usually manned by younger workers and 
that certain features of these jobs probably 
make them unsuitable for older workers. If
ways could be suggested of modifying jobs in 
which these features are present, some of the 
wastage due to migration from “young” jobs 
might be eliminated. 

There seem to be two approaches to the 
problem of suggesting modifications. First, 
by paying attention to the results of experi- 
mental studies of performance changes asso- 
ciated with age (e.g. 2, 10), one could base 
recommendations upon known changes occur- 
ring with age, applying laboratory findings 
directly to the work situation. It is doubt- 
ful, however, whether sufficient is yet known 
about the relationship between age and total 
job behavior to justify this approach. 

The second possible approach involves the 
preliminary study of jobs themselves, in order 
to identify those features which may be criti- 
cal in the effective performance of older 
workers. 

The purpose of this paper is to report an
attempt, involving job study, to broadly
identify job activities which are likely to
differ among younger and older workers and
which may be taken to represent areas in
which, after more detailed investigation,
modifications are likely to prove effective.

1 W. A. Tucker is now with the Wool (and Allied)
Textile Employers’ Council Work Study Centre,
Bradford, England. The authors are indebted to the
Nuffield Foundation of London, which financially
supported this project. They are also grateful to
K. F. H. Murrell for comment and suggestions.


Method 


Before undertaking the job studies, two methodo-
logical issues required attention. In the first place,
upon what basis should jobs be selected for study?
Secondly, what sort of information should be col-
lected about them? After consideration of the first
of these issues, the obvious approach of comparing
the contents of jobs manned by younger workers
with those manned by older workers appeared dan-
gerous and the results likely to prove misleading.

Since the now classical study of secretaries (4),
the dangers inherent in the use of job titles have
been emphasized repeatedly. It is misleading to
treat job titles as if they embraced a complex of ac-
tivities and requirements all of which are to be
found to the same degree in the work of each per-
son assuming the title. In previous industrial studies
in the field of ageing (10), the caution required be-
fore it could reliably be assumed that persons in
nominally the same job were doing exactly the same
work was recognized. Variations of activity which
appear, on the surface, to be unimportant may be
vital in the study of special groups such as older
workers. Although this point is not especially
stressed, it is probably largely responsible for Bridges’
(3) and Hanman’s (6) pleas for intensive, compre-
hensive and accurate job analyses in the case of dis-
abled workers.

It was decided instead to examine the actual work
engaged in normally by a group of younger and a
group of older workers. In this way, it was hoped
that critical differences in work content would be
displayed. The two groups were distinguished
sharply by age, in order to contrast them as much
as possible, and to reduce possibilities of overlap in
“effective” age.2 The two groups were randomly
drawn from two populations of workers employed
in a firm manufacturing piston and ...
aero engines. The two populations were roughly
matched for such things as age of entry into engi-
neering, domestic circumstances, type of education
and minimum length of service (three years) in their
jobs at the time of the study. Minimum length of
service was introduced in order to increase the
chances of selecting for examination men who, by
virtue of their retention in their jobs, could be con-
sidered of at least average effectiveness. The two
samples were aged 24 to 30 years, and 48-61
years, and included occupants of 10 specially se-
lected jobs. These, which were selected on the ba-
sis of being reasonably comparable in terms of
the basic activities they involve, were:

2 The original approach would have been tanta-
mount to studying the work done by men of
different ages, except that the clear separation of
the age groups would not have been possible.
Analysis and interpretation of the data would
have become more difficult.


Borers (Y: 2, O: 1)
Capstan Operators (Y: 2, O: 1)
Drillers (Y: 3, O: 5)
Fitters (Y: 5, O: 7)
Grinders (Y: 7, O: 4)
Inspectors (Y: 5, O: 7)
Instrument Makers (Y: 6, O: 4)
Millers (Y: 5, O: 6)
Polishers (Y: 2, O: 3)
Turners (Y: 5, O: 8)
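
The bracketed counts can be checked against the group sizes reported in this section (42 younger and 46 older workers, drawn as every twentieth name from rolls of about 850). The sketch below is illustrative only: the older-group figure for Instrument Makers is derived from the stated total rather than read from the list, and the variable names are hypothetical.

```python
# (younger, older) occupants per job, as listed above. None marks the older-group
# figure for Instrument Makers, which is derived below from the stated total.
COUNTS = {
    "Borers": (2, 1),
    "Capstan Operators": (2, 1),
    "Drillers": (3, 5),
    "Fitters": (5, 7),
    "Grinders": (7, 4),
    "Inspectors": (5, 7),
    "Instrument Makers": (6, None),
    "Millers": (5, 6),
    "Polishers": (2, 3),
    "Turners": (5, 8),
}

YOUNGER_TOTAL = 42   # group sizes stated in the text
OLDER_TOTAL = 46
ROLL_SIZE = 850      # approximate size of each nominal roll
SAMPLING_STEP = 20   # every twentieth name

younger_sum = sum(y for y, _ in COUNTS.values())
older_known = sum(o for _, o in COUNTS.values() if o is not None)
implied_instrument_makers = OLDER_TOTAL - older_known
expected_sample = ROLL_SIZE / SAMPLING_STEP

print("younger total from list:", younger_sum)
print("older count implied for Instrument Makers:", implied_instrument_makers)
print("names expected from a 1-in-20 sample of 850:", expected_sample)
```

The younger counts sum to the reported 42, and the reported older total of 46 fixes the remaining figure. A one-in-twenty draw from about 850 names gives roughly 42 or 43 men, in line with both group sizes.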


The figures in brackets refer to the number of
younger (Y) and older (O) occupants of jobs in
the two samples which were studied. The nominal
roll of each population contained approximately 850
names, and random sampling was achieved by select-
ing for study every twentieth name appearing on
the alphabetical lists.

This process produced a younger group of 42 work-
ers, and an older group of 46 workers. Before start-
ing the analysis, the “satisfactoriness” of each of
these workers was checked with the firm’s labour
department, as an additional measure to ensure that
reasonably effective men were, in fact, being studied,
and the personal cooperation of each man was
sought. Cooperation was given freely, and all mem-
bers of both groups were passed by the firm as being
of at least average effectiveness in their jobs.

The second main issue of what information should be
collected about the work of each man raised many
questions germane to the whole practice of job analy-
sis. Many writers have stressed the dangers of rat-
ing job content subjectively, and at least one study
in recent years has clearly demonstrated the lack of
consistency which may exist between job analyzers
(9). Discussion of job analysis issues is outside
the purpose of this paper, but it may be noted that the
purpose of this study was such that most con-
ventional methods of obtaining information about
jobs, geared as they are to the establishment of se-
lection procedures and the like, seemed inapplicable.
Basically, what was required was a system of classi-
fying job activities which reflects a model of the hu-
man operator which is both legitimate on psychologi-
cal grounds, yet comprehensive enough to cover job
contents which may have implications outside the
formal scope of psychology. At the same time, this
system should be well defined enough to allow the
recording of job activities in an absolutely objective
manner.

It was decided to tackle the problem by concen-
trating upon those wide areas of activity which ac-
count for most of the work covered by the study,
and which could be objectively recorded by timing,
with the aid of a stop watch, total work cycles, and
the time spent in the various activities of which it
was composed, and by noting the frequency of ac-
tivities during the work cycle.

For purposes of indicating the scope of the in-
vestigation, a list of areas which were considered
follows:

Lifting: weight, height, distance, frequency
Posture: standing, sitting, walking, stooping
Bodily Activity: hands/arms, trunk, legs
Controls: total number, number used, frequency
   used, relationship with displays
Displays: total number, number used, frequency
   used
Instruments Used
Instructions Followed
Tolerances Worked to
Work Cycle Data: total length, and, in the case of
   machinists, proportion of cycle time spent fitting
   and removing components, setting, machining,
   checking, on “automatic”
Working Speed
Degree of “Visual Attention” Required
Degree of “Perceptual-Motor Co-ordination” Re-
   quired
Work “Finish”

In actual practice it proved impossible to record
the last two items objectively, owing probably to
the inadequacy of definition. These were conse-
quently disregarded in conducting analyses as find-
ings about them would probably have been mis-
leading.

Each member of each group was observed at
work for between one and four or five days, accord-
ing to the length of his work cycle. Information
about the contents of his work was recorded on a
specially prepared form, only after the analyzer was
certain that variations (which turned out to be small
enough in all cases except one to be negligible) were
noted. Information was then transferred to “master
sheets,” and the presence and extent of each feature
in the two groups were computed, and statistical
comparisons between the groups undertaken.


Results 


In presenting the results of this study, 
we will concentrate upon those job features 
in which significant differences between the 
groups were displayed. Certain features 
showed no difference; this may have been 
due to the fact that these features are not 
critical in relation to age, as well as to the 
probable lack of sensitivity of the methods 
used. The features which failed to show sig- 
nificant age differences were those concerned 
with the degree of lifting involved, bodily ac- 
tivity (which very nearly showed a signifi- 
cant difference), the instruments used, the in-
structions followed, the tolerances worked to,
the speed of working, and, as previously 
stated, “perceptual-motor co-ordination” and 
work “finish.” The two points of greatest 
interest in this catalogue of negative results 
are, first, the absence of speed as a factor 
critical in the employment of older workers, 
and, secondly, the apparent unimportance of 
“physical effort” in the form of lifting. 

Previous workers, notably Belbin (1) and 
Welford (10), have emphasized the critical 
nature of certain forms of pacing and speed 
stress, and it is not surprising that this study 
did not confirm these previous findings as, in 
the factory in which this study was under- 
taken, all working speeds were essentially un- 
der the control of individual operators. 

The apparent unimportance of “physical
effort” is also in line with Belbin’s (1) previ-
ous findings. When heavy work is not com-
bined with certain other factors it seems that
it does not assume critical proportions. One
of the most important additional factors com-
bining with heavy work to provide difficulty
is pacing, and this was not present. It is rea-
sonable, therefore, that lifting should not have
appeared as critical. By the same token, other
factors might have assumed critical propor-
tions had pacing been present, and this should
not be overlooked in considering the negative
results.


The following features showed significant
differences between the younger and older
groups:

Stooping. A record was made of all those
workers who worked for more than 50% of
their time stooping at an angle of more than
approximately 30° from the vertical. Of the
42 younger workers, 18 worked under condi-
tions of stooping satisfying this criterion, and
of the 46 older workers, only six were found
to stoop to this extent. The difference be-
tween these proportions, .30, was significant
at p < .01.

Controls. Taking the number of controls
present on a machine, and expressing the
actual number used during work as a pro-
portion of the total, a contingency table was
constructed, in which cell entries refer to the
number of cases observed to fall into the cate-
gories indicated. This is shown in Table 1.
From these data a χ² of 7.23 was calculated
which, with df = 2, is significant at p <
.05. The proportion tended to be greater in
the case of older workers.


Table 1
The Use of Controls

                   Proportion of Used/Total Controls
                     <.40     .40-.60     >.60

Younger Workers        6        10         10
Older Workers          1         7         20


Displays. Taking the same ratio in the
case of displays in use (in all cases, these
took the form of scalar indicators), Table 2
was constructed. In this case, χ² = ?.97,
which is significant at p < .01. In this
case, the proportion tended to be less in the
case of older workers.


Table 2
The Use of Displays

                   Proportion of Used/Total Displays
                     <.40     .40-.60     >.60
                       7
Younger Workers        1         9          ?
Older Workers         13         6


Machining. The proportion of the work
cycle spent by machinists with machines actu-
ally running, as opposed to setting, checking
or fitting or removing components, was calcu-
lated. Again, a contingency table was con-
structed from these data, and this is shown in
Table 3. These data gave χ² = 8.75, which
is significant at p < .02. The proportion
tended to be greater in the older group of
workers, implying that older workers spent
relatively less time than younger workers in
setting and checking, etc.3


Table 3
Machining Time

                   Proportion of Work Cycle
                       Spent Machining
                     <.40     .40-.60     >.60
                                 7         10
Younger Workers        8         4         19
Older Workers          1         7


Visual Attention. The use of this expres-
sion is perhaps somewhat misleading. It was
assessed by calculating the proportion of the time
spent during the working day actually looking
at the work. Table 4 shows
the proportions in the case of younger
and older workers. In this case χ² = 14.38,
which is significant at p < .01. The time
spent looking at the work tended to be greater
in the younger group.


Table 4
Attending to Work Visually

                   Proportion of Working Day Spent
                       Looking at the Work
                     <.30     .30-.70     >.70

Younger Workers        4        16         22
Older Workers         13        25          8

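
Two of the reported comparisons can be retraced from figures given in the text. The sketch below is a hedged illustration, not the authors' procedure: the paper does not say which test was applied to the stooping proportions, so a pooled two-proportion z test is assumed, and the chi-square uses the Table 4 counts as they read in this copy. The function names are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Difference of two proportions, pooled z statistic, two-sided normal p."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the normal curve
    return p1 - p2, z, p_value


def chi_square(table):
    """Pearson chi-square for an r x c table of counts; returns (statistic, df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Stooping: 18 of 42 younger workers and 6 of 46 older workers met the criterion.
diff, z, p = two_proportion_z(18, 42, 6, 46)
print(f"difference = {diff:.2f}, z = {z:.2f}, two-sided p = {p:.4f}")

# Visual attention: rows are younger and older workers, columns are the
# <.30, .30-.70 and >.70 categories of Table 4.
CRITICAL_01_DF2 = 9.21  # chi-square critical value for p = .01 with df = 2
statistic, df = chi_square([[4, 16, 22], [13, 25, 8]])
print(f"chi-square = {statistic:.2f}, df = {df}, "
      f"exceeds .01 critical value: {statistic > CRITICAL_01_DF2}")
```

The difference in stooping proportions rounds to the .30 quoted and is significant beyond the .01 level under the assumed test. The chi-square recomputed from the printed Table 4 counts comes out at about 13.1, somewhat below the 14.38 reported but still above the .01 critical value, so either the original computation differed in detail or a cell count is misprinted in this copy.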
In addition to these features, one other
showed a nonsignificant difference which
was nearly significant at the 5% level, and
which deserves mention in passing. The
amount of activity of hands and arms dif-
fered in the two groups: it was generally less
in the older group. Whilst this result can-
not be taken too seriously, it is interesting
that Belbin (1) quotes a similar finding. It
is interesting also that recent work has indi-
cated that the monitoring of movement may
be an activity which is a source of increasing
difficulty with age (5, 11).


Discussion 


The findings relating to machine controls
and displays are extremely difficult to inter-
pret. They may be taken to provide prima
facie support of Welford’s hypothe-
sis (10) that deterioration occurring with age
mainly affects central organization on the re-
ceptor side. Alternatively, a concept of “re-
dundancy” may prove useful in interpreting
the findings, and planning clarificatory studies.

3 Discrepancies between total numbers in groups in
contingency tables relating to control, displays,
and machining are due partly to the fact that on
many machines no scalar indicators were present,
and partly to variations, in one or two cases, in the
proportion of the work cycle occupied by machining
which could not be accurately determined.

It is quite likely that stooping and watch- 
ing occur together, and that both occur more 
frequently when machining is not in progress. 
Again, further investigation should clarify 
these issues, and one of the more important 
results of such a further investigation should 
be the identification of which of these fea- 
tures, if indeed they are related, is the most 
critical, as it is likely that two of them may 
only appear to be critical in view of their 
being always associated with the third, the 
really critical feature. At any event, these 
three features define a broad area which 
should warrant closer examination. 


Summary 


Approaches to job study preliminary to the 
modification of industrial equipment for the 
use of older workers are discussed. A study 
is described in which the work of a younger 
group and an older group of engineering work- 
ers is examined in order to identify broad 
areas in which features critical to the effective 
performance of older workers are to be found. 
The results suggest two broad areas in which 
more detailed study should be repaid, and
in which modifications may prove effective. 
These relate to the existence of redundant 
controls and scalar indicators upon machine 
tools, and the prevalence of stooping and the 
closeness with which work has to be watched, 
in relation to certain machining activities 
other than those involved when machines are
actually running.


Received December 11, 1957. 


References 


1. Belbin, R. M. Difficulties of older people in in-
dustry. Occup. Psychol., 1953, 27, 177-190.

2. Birren, J. E. Age changes in speed of simple re- 
sponses and perception and their significance 
for complex behaviour. In Old Age in the 
Modern World. Edinburgh: Livingston, 1955, 
235-237. 

3. Bridges, C. Job placement of the physically
handicapped. New York: McGraw-Hill, 1946. 

4. Charters, W. W. & Whitley, I. B. Analysis of 
secretarial duties and traits. Baltimore: Wil- 
liams and Wilkins, 1924. 




5. Griew, S. Age changes in the initiation of re- 
sponses. Proceedings of the 15th International 
Congress of Psychology, 1957. In press. 

6. Hanman, B. Physical capacities and job place- 
ment. Stockholm: Nordisk Rotogravyr, 1951. 

7. Murrell, K. F. H., Griew, S., & Tucker, W.A. Age 
structure in the engineering industry: A prelimi- 
nary study. Occup. Psychol., 1957, 31, 150-168. 

8. Murrell, K. F. H., & Griew, S. Age structure in 
the engineering industry: A study of regional 
effects. Occup. Psychol., 1958, 32, in press. 




9. Rupe, J. C. Research into basic methods and 
techniques of Air Force job analysis, IV. San
Antonio, Texas: Lackland Air Force Base,
Res. Rep. AFPTRC-TN-56-51, 1956. 

10. Welford, A. T. Skill and age: An experimental 
approach. London: Oxford Univer. Press, 
1951. 

11. Welford, A. T. Psychological aspects of ageing. 
In W. Hobson (Ed.), Modern trends in geri- 
atrics. London: Butterworth, 1956. 




The Effect of Display Width in Merchandising Soap 


Douglas H. Harris 


Occupational Research Center, Purdue University 


For many years, grocers have used the
technique of increasing the shelf display
width of canned goods to sell out more
quickly an inventory of slow selling merchan-
dise. The success of this procedure would
seem to indicate that shoppers in self-service
stores buy on impulse and, thus, are greatly
influenced by conditions at the point of selec-
tion. On the other hand, however, this may
be true only when infrequently advertised
products are involved, since in this situation
“brand loyalty” may not become a deterring
factor in impulse buying.

The purpose of this experiment is to deter-
mine whether or not buyers in a self-service
store are influenced to buy relatively more
products from a wider display than from a
narrower one, when the products involved are
well advertised.


Method 


Selection of markets. Three supermarkets were se-
lected in Lafayette, Indiana, on the basis of location.
Supermarket I was located on the east side of the
city in a neighborhood of lower income families.
Supermarket II was located on the south side of the
city in a neighborhood of higher income families.
Supermarket III was located next to Purdue Univer-
sity.

Selection of soap brands. Two soap powder prod-
ucts were selected within each market in accordance
with the following criteria:

1. Both products must be classified by the manu-
facturers as heavy detergents (all-purpose deter-
gents).
2. Both products must be handled only in the
giant size and one smaller size in the store.
3. Both products must be packaged in a box which
is predominantly blue in color.
4. Both products must sell for the same price.
5. There must be no sales promotion gimmick con-
nected with either product.
6. Both products must be well advertised, both
nationally and locally.
7. The sales ratio, in giant size boxes for the previ-
ous month, of one product to the other must be 6 to
4 in each market. There is nothing special about
the 6 to 4 ratio except that when products were
chosen to meet the above criteria and also to pro-
vide the same sales ratio in each market, the 6 to 4
ratio in each store resulted. The largest selling prod-
uct will be called A, the lesser selling product B.

Procedure. Three shelf-display situations were 
used in each of the three stores: 


Situation one: 3 facings of A—1 facing of B 
Situation two: 2 facings of A—2 facings of B 
Situation three: 1 facing of A—3 facings of B 


In each situation, the depth and height of each 
product display were kept equal. The two products 
were located side by side toward the middle of the 
soap section and this position remained constant 
throughout the experiment. The display width of 
the smaller size of each product was constant at 1 
facing each, located to one side of the giant size dis- 
play and with depth and height kept equal. 

The display situations were changed in each store
after a total of 10 boxes of both products had been
sold. Starting time for each store was on a Wed-
nesday afternoon with the display being changed in
the following order.


                   Market       Market       Market
                      I           II          III
First Display     1A and 3B    2A and 2B    3A and 1B
Second Display    3A and 1B    1A and 3B    2A and 2B
Third Display     2A and 2B    3A and 1B    1A and 3B

Results 


Table 1
Number of Selections by Supermarket Shoppers

                                        Store
               Combined        I           II          III
               Product      Product     Product     Product
Display        A     B      A    B      A    B      A    B

Expected      18    12      6    4      6    4      6    4
3A and 1B     20    10      7    3      5    5      8    2
2A and 2B     22     8      ?    ?      ?    ?      6    4
1A and 3B     21     9      7    3      6    4      8    2


The results are presented in Table 1. Chi
square for the difference between obtained and
expected frequencies is not statistically signifi-
cant at the 10% level for the combined store
totals nor for any one store. It is recognized 
that the power of this test for the size of sam- 
ples involved is not great, but it is supported 
by the additional evidence that in not one 
store is there a trend in the direction of the 
alternate hypothesis. 
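
The combined-store comparison can be illustrated with a goodness-of-fit chi-square against the 6 to 4 sales ratio. This is a sketch under stated assumptions: the paper does not say exactly how chi square was computed, so each display situation is tested separately here with df = 1, using the combined A and B counts from Table 1. The function and constant names are hypothetical.

```python
def goodness_of_fit(observed, expected):
    """Pearson chi-square comparing observed counts with expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


EXPECTED = (18, 12)           # 6 to 4 ratio applied to 30 boxes per display
CRITICAL_10_DF1 = 2.706       # chi-square critical value, 10% level, df = 1

# Combined counts of product A and product B for each display situation.
COMBINED = {
    "3A and 1B": (20, 10),
    "2A and 2B": (22, 8),
    "1A and 3B": (21, 9),
}

results = {
    display: goodness_of_fit(observed, EXPECTED)
    for display, observed in COMBINED.items()
}

for display, statistic in results.items():
    verdict = "significant" if statistic > CRITICAL_10_DF1 else "not significant"
    print(f"{display}: chi-square = {statistic:.2f} ({verdict} at the 10% level)")
```

None of the three statistics reaches the 10% critical value. Product A's share also does not fall as its display narrows from three facings to one, which is the pattern the text describes as no trend toward the alternate hypothesis.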

This study demonstrates, then, that increas- 
ing the relative display width of a packaged 
soap product does not increase the relative 
sales of that product. 




Summary 


The relative display widths of two well-ad- 
vertised packaged soap products were varied 
in each of three supermarkets. 

The resulting selections by a total of 90 
shoppers indicated that increasing the rela- 
tive display width of a well-advertised soap 
product does not increase the choices of that 
product by self-service store shoppers. 


Received December 20, 1957. 




Accuracy of Recall Using Keyset and Telephone Dial, and 
the Effect of a Prefix Digit * 


R. Conrad 
Applied Psychology Research Unit, Cambridge, England 


A topic of current interest to telephone en-
gineers is concerned with the relative merits
of conventional telephone dials and a decimal
set of keys (keysender) for transmitting tele-
phone numbers. One advantage of the key-
sender is that the telephone operator is not
delayed by the interdigital pauses that occur
with dials, so that speed of sending may be
increased. The two methods are thus
distinguished by the fact that in dialling, the
sender is partly paced because the upper
limit is set by the design of the instru-
ment.

A recent study (6) using eight-digit mes-
sages has shown paced recall to be inferior to
unpaced recall. This finding would have little
relevance in the problem of the relative
merits of dial and keysender so long as the
length of telephone numbers was well within
the span of immediate memory, but
with the use of long digit sequences in na-
tional trunk numbering systems (up to 10
digits in Great Britain), errors of memory are
liable to occur. Although the conditions of
paced recall in the experiment referred to
above were considerably more constrained
than is usual in dialling telephone numbers,
it seemed justifiable to determine whether the
results would generalize to a more realistic
field situation.

A feature of many trunk numbering sys-
tems is that all trunk numbers, as contrasted
with local area numbers, are prefixed by a
digit which is always the same and which acts
as a switch. In Britain the digit 0 is used.
Within the appropriate class of numbers, this
digit is redundant. Nevertheless, for any one
of many reasons one might predict that its
presence would increase the probability of

* The author wishes to thank the British General
Post Office and the Union of Post Office Workers for
provision of the facilities for this experiment. M. Stone
advised on statistical treatment, and Barbara A. Hille
carried out the tests. The work was supported by
the British Medical Research Council.

memory failure. In spite of its redundancy,
for instance, the prefix might be treated as an 
extra digit, which if it occurred in a critical 
region of the immediate memory span would 
lead to increased error. Or by merely delay- 
ing the transmission of the succeeding digits, 
the processes of decay of memory might be 
hastened. A second aim of the experiment to 
be reported therefore was to test the hypothe- 
sis that the use of a redundant prefix digit 
would have no effect on memory errors. 


Method 


Two types of instrument were used. 


Apparatus. 
d of two horizontal rows 


The keysender was compose 
of circular keys numbered 1-5 and 6-0, from left to 
right. These were mounted as a normal part of a 
telephone operator's training position. Pressing a 
key was registered as an illuminated number on a 
panel located some distance away and behind iS: 
When an operator keyed out a sequence of digits, 
the entire sequence remained illuminated for some 
10 sec., enabling E to record the order in which keys 
were pressed. S indicated the end of a sequence by 
pressing a key marked FIN. 

The second instrument was a conventional British
G.P.O. dial telephone mounted in front of S on the
same training position as the keysender. The sequence
of digits dialled was automatically recorded
in clear print on a Zoller Recorder situated in another
room.

Digit messages were recorded on a Ferrograph tape
recorder at a rate of 100/min., with an interval between
messages for S to respond. The output of the
recorder was fed into a pair of headsets, one worn
by S and one by E. The S and E could also talk to
each other through the same headsets.

Test material. Throughout, eight-digit messages
were used, the digits being drawn from a decimal
vocabulary. The messages were carefully constructed
in such a way that each digit occurred an equal
number of times in each serial position, obviously
easy phrases being avoided. Chi-squared tests at
the end of the experiment showed that there were
no significant differences among the messages in
terms of frequency of correct recall. Eighty messages
were constructed and arranged in four lists
each of 20 messages.

Subjects. The Ss were 24 female Post Office telephonists,
who had the special merit of being thoroughly
experienced in the use both of the keysender
and the dial. The keysender was normal equipment
in everyday use, whilst the dial was sometimes used
during work, and always used out of working time.
No special training was therefore necessary. All Ss
were volunteers, and the testing was carried out during
working hours.
Design. There were four experimental conditions
designated as follows:

K ........ Keysender
KO ....... Keysender used with prefix digit 0
DO ....... Dial used with prefix digit 0
D ........ Dial

These four conditions were randomized in six different
4 × 4 latin squares, and each S was tested under
each condition with a different list of messages.

Procedure. In the K and D conditions, Ss were
merely told to listen to the messages and when each
had ended to key or dial what they had heard. Before
the KO and DO conditions, the same instruction
was given, but Ss were told that all messages were
to be prefixed by the digit 0 which was not on the
tape. Messages beginning with 0 were to be treated
in the same way. Since Ss were familiar with the
use of prefix digits, no difficulty was encountered in
giving these instructions. Each condition required
about 10 minutes and Ss did two conditions on one
day and two on the next day.
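The counterbalancing described under Design can be sketched in a few lines of Python. This is an illustration only: the cyclic construction, the function names, and the fixed random seed are assumptions made for the example, not the procedure by which the six squares were actually drawn.

```python
import random

CONDITIONS = ["K", "KO", "DO", "D"]

def latin_square(items, rng):
    """Build one randomized Latin square from a cyclic base square.

    Shuffling the rows, the columns, and the labels of a Latin square
    always yields another Latin square.
    """
    n = len(items)
    base = [[(r + c) % n for c in range(n)] for r in range(n)]
    rows = list(range(n))
    cols = list(range(n))
    labels = list(items)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[base[r][c]] for c in cols] for r in rows]

def assign_orders(n_squares=6, seed=0):
    """Return one test order (a row of a square) per subject."""
    rng = random.Random(seed)  # seed is arbitrary, for reproducibility
    orders = []
    for _ in range(n_squares):
        orders.extend(latin_square(CONDITIONS, rng))
    return orders

orders = assign_orders()  # 6 squares x 4 rows = 24 subjects
```

Each of the 24 rows gives the order in which one S meets the four conditions. Within any one square, every condition appears once in every test position.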


Results 


A message was scored as correctly reproduced
only when all digits were given in the
correct order. The mean scores of each of
the six latin squares are given in Table 1,
from which it will be seen that the differences
between conditions are in the predicted direction.
Since the scores of individuals for each
condition are out of 20, they were normalized
by making the refined angular transformation
due to Anscombe (1). Analysis of variance
was then carried out, the results of which are
summarized in Table 2.

Table 1
No. of Correct Messages (Max. = 20)

Mean of Square      K       KO      DO      D
      1            7.75    5.75    4.50    4.50
      2           11.50    9.00    7.75   11.50
      3           11.50    9.00    7.50    8.00
      4           14.00   12.00    8.50   13.50
      5           10.25    9.75    4.75    9.75
      6           13.25    8.75    8.75   13.50
    Mean          11.38    9.04    6.96   10.13

Table 2
Analysis of Variance for Number of Correct Messages

Source of Variation         df    Mean Square     F       P
Squares                      5      2,446.00
Subjects within Squares     18     28,651
Test Order                   3      1,105.57     3.52    <.05
Conditions                   3      3,018.28     9.61    <.001
Order × Squares             15        197.31
Conditions × Squares        15        215.07
Residual                    36        314.12
Total                       95

Note. The error variances of the six latin squares were
tested and found to be homogeneous before pooling for the
above table.
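For readers unfamiliar with the transformation, the following minimal sketch shows the form usually attributed to Anscombe (1) for a binomial count x out of n. It assumes that this is the "refined angular transformation" meant here. The function name and the choice of radians are assumptions for the example, and the scaling actually used before the analysis of variance is not stated in the text.

```python
import math

def anscombe_angular(x, n=20):
    """Variance-stabilizing angular transform of a binomial count.

    Assumed form: arcsin(sqrt((x + 3/8) / (n + 3/4))), in radians.
    """
    return math.asin(math.sqrt((x + 3 / 8) / (n + 3 / 4)))

# Transform a few possible scores out of 20 correct messages.
transformed = [anscombe_angular(x) for x in (0, 5, 10, 15, 20)]
```

The transformed scores rise steadily with the raw score, and a score of 10 out of 20 maps exactly onto pi/4, the midpoint of the angular scale.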
Of major relevance is the variance between
conditions which shows differences significant
at better than the .001 probability level. The
differences between pairs of conditions were
tested by Duncan's multiple range test (8)
used at the .05 level with the following results:

Dial versus keysender. This result is not
conclusively shown. The keysender is not
significantly better than the dial in the simple
eight-digit condition. When the stress of
a prefix digit is added (KO v. DO), an
advantage for the keysender, which is significantly
better, is seen. The nature of this
stress will be referred to later.

Effect of prefix digit. Adding a prefix digit
results in significantly fewer correct messages
whether the keysender or dial is used. A
glance at Table 1 shows how consistent and
pronounced this effect is.

Digit confusion. Errors occurring in the
reproduction of a message can be classified
into two broad groups: first, order errors
when S has clearly transposed usually two,
but sometimes more, digits. The correct
digits are given in the wrong order in such
a way as to suggest that S is not merely
guessing; e.g., the message 80274163 would be
reproduced as 80271463. The second group
comprises all other errors of which three kinds
are likely: (a) S forgets one or more digits
and leaves blanks, (b) S forgets one or more
digits and guesses, (c) S consistently confuses
certain pairs of digits. In most cases,
omissions are obvious, and guesses are equivalent
to omissions. But there is the problem
of distinguishing guesses from genuine confusions.
It can be assumed that if S guesses,
he will choose all possible digits with equal
probability. Then a digit which occurs in the
place of another digit more often than would
be expected by chance can be regarded as
genuinely confused. In the present experiment,
this analysis was simplified because
each digit occurred with equal frequency in
the test messages.

The data from all four conditions were
pooled, and a confusion matrix set up. The
expected value in each cell was one ninth of
the total number of times each digit was
wrong, since each digit could be confused with
any of nine others. The difference between
observed and expected distribution of errors
amongst the nine possible digits was separately
calculated for each of the 10 digits
used. Yates' corrected chi squared was used
to test these differences and in only one case
were the two distributions significantly different
(.05 level). The digit 2 was called 3
more often than would be expected by chance.
But 3 was not called 2 beyond chance frequency.
The only special feature of this confusion
is the contiguity of the digits in the
numerical scale and on the layout of keysender
and dial. Since no other contiguous
pairs were confused, this particular relationship
appears to be of no special significance.
All other apparent confusions must be regarded
as guesses, i.e., completely forgotten.
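One possible reading of this analysis is sketched below. The error counts are invented for illustration and are not the experimental data. Testing each response cell against the remaining eight with a 1-df Yates-corrected chi squared is an assumption about how the correction was applied, and all names are hypothetical.

```python
CRITICAL_05_1DF = 3.841  # chi-squared critical value, 1 df, .05 level

def yates_cell_chi2(observed, total, n_alternatives=9):
    """Yates-corrected chi squared for one response cell versus the rest.

    Under pure guessing each wrong digit is expected total / 9 times.
    """
    expected = total / n_alternatives
    dev = max(0.0, abs(observed - expected) - 0.5)
    return dev ** 2 / expected + dev ** 2 / (total - expected)

def confused_responses(error_counts):
    """Return wrong responses given significantly more often than chance."""
    total = sum(error_counts.values())
    k = len(error_counts)
    return [resp for resp, obs in sorted(error_counts.items())
            if obs > total / k
            and yates_cell_chi2(obs, total, k) > CRITICAL_05_1DF]

# Hypothetical counts: what S gave when the digit 2 was presented and missed.
errors_for_2 = {0: 4, 1: 5, 3: 13, 4: 4, 5: 4, 6: 4, 7: 3, 8: 4, 9: 4}
flagged = confused_responses(errors_for_2)
```

With these invented counts only the response 3 exceeds the chance expectation of 5 by a significant margin. The other responses would be treated as guesses.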


Discussion 


The effect of prefixing a message with a
redundant digit is clear. It will be seen from
Table 1 that in the case of both keysender
and dial none of the six squares shows a
counter effect. The simplest explanation
might be that Ss treat the prefix as an extra
digit making the message effectively one of
nine digits. In general, this is certainly not
true, since the proportion of errors at each
serial position is the same in both prefix and
nonprefix conditions. If the prefix digit were
regarded as conveying as much information
as the others, it would itself be subject to
some error and the first digit of the message
proper would show as much error as the second
digit in the nonprefix condition. Neither
of these two effects occurs. It seems fairly
certain that all Ss treat the prefix 0 as being
redundant. Giving due regard to the difference
between conditions, the proportions of
correct messages recorded are compatible with
those from a similar group of Ss for eight-digit
messages, and incompatible with previously
reported scores for nine-digit messages
(5).
A second possible explanation might be 
along the lines of decay theory suggested by 
Brown (4), Broadbent (3) and others. On 
this view, the longer the interval between 
presentation and recall of a digit, the greater 
is the chance of forgetting. This explanation 
would satisfy the data for the dialling condi- 
tions. Dialling the digit 0 interposes a rela- 
tively long delay before the required mes- 
sage can be recalled. But this cannot so 
justifiably be claimed for the keysender. De- 
lay is indeed present, but it is very short; yet 
the increase in error is almost as large as it is 
in the case of dialling. 

It may be that merely remembering the
prefix diminishes ability to recall the message,
and this could occur if it interfered with immediate
postpresentation rehearsal. Although
the explanation of this effect is uncertain,
some of the possibilities discussed could easily
be tested.

The advantage of keysender over dial has
only partly been shown. But the effect which
shows when the prefix is used is so pronounced
that there can be little doubt about
it. The extra stress of the prefix not only
worsens performance, but also differentiates
between keysender and dial. It would be
tempting to think that had nine-digit messages
been used, the expected effect would
have been more clearly demonstrated. It will
be recalled that the predicted effect was based
on the results of a previous experiment employing
paced recall (4). In the present experiment
there was the important difference
that S was free to rehearse the message between
presentation and recall. That this difference
is important is evident from the difference
in performance level between the two
comparable groups of Ss. In the present experiment,
the nonprefix paced recall (dial)
condition yields about 50% correct messages.
In the earlier study without rehearsal, the figure
is about 35%. In summary, it seems reasonable
to conclude that if the difficulty of
recall is such that less than half the messages
are correct, then the keysender will show a
significant advantage over the dial.

The analysis of digit confusions indicates 
that there is no feature of the immediate 
memory function which could lead to system- 
atic confusions of one digit with another for 
whatever reason. It appears that if con- 
fusions occur, they must be ascribed to weak- 
ness elsewhere in the communication system. 
Indeed it has been shown (7) that when 
spoken digits are automatically recognized by 
a machine, if the digit 2 is confused, it is 
fairly likely to be called 3. In fact only two 
kinds of error in immediate memory for digits 
occur, and these are order errors and omissions.
The systematic changes in material
that are characteristic of long term memory
(2) do not seem to appear.


Summary 


A test of immediate memory for eight-digit
messages was given to female telephone
operators. Adding a redundant prefix digit
significantly worsened recall. When messages
were transcribed onto a 10-digit keysender,
recall was not significantly
better than when transcribed onto a telephone
dial. But when a prefix digit was introduced,
the dial proved to be an inferior method of
transcription. It would seem that at about
the level of difficulty when more than half the
messages would be forgotten, recall would be
improved by use of keysender rather than
telephone dial.

Recall errors were analyzed digit by digit.
All errors could be classified into order errors
and omissions. No evidence was found that
certain digits would be systematically confused
with certain others.


Received February 3, 1958. 


References 


1. Anscombe, F. J. The transformation of Poisson,
binomial and negative binomial data. Biometrika,
1948, 35, 246-254.

2. Bartlett, F. C. Remembering: A study in experimental
and social psychology. London:
Cambridge Univer. Press, 1932.

3. Broadbent, D. E. A mechanical model for human
attention and immediate memory. Psychol.
Rev., 1957, 64, 205-215.

4. Brown, J. Immediate memory. Unpublished doctoral
dissertation, Univer. of Cambridge.

5. Conrad, R., & Hille, Barbara A. Memory for
long telephone numbers. Telecommunications,
1957, 10, 37-39.

6. Conrad, R., & Hille, Barbara A. Decay theory of
immediate memory and paced recall. Canad.
J. Psychol., 1958, 12, 1-6.

7. Davis, K. H., Biddulph, R., & Balashek, S. Automatic
recognition of spoken digits. In W.
Jackson (Ed.), Communication Theory. London:
Butterworth's Scientific Publications, 1953.

8. Federer, W. T. Experimental design. New York:
Macmillan, 1955.




AVA as a Predictor of Occupational Hierarchy 1


Peter F. Merenda and Walter V. Clarke 
Walter V. Clarke Associates, Inc. 


The Activity Vector Analysis (AVA) is a
self concept instrument (4) measuring human
temperament. It has been widely used in industry
as a tool for the classification and selection
of personnel both in the management
and worker hierarchies (2, 3, 5). AVA is
based upon the fundamental premise that
over and above basic aptitude and ability
factors, successful performance in any job is
largely a function of personal temperament
and behavior. It is founded on a theory of
personality (1) which postulates that all human
behavior can adequately be described in
terms of four areas: aggressiveness, sociability,
emotional stability, and social adaptability.

AVA is a list of 81 nonderogatory words
which may be used in describing human behavior.
The testee is required first to check
those words which have ever been used by
anyone in describing him (Column 1) and
then to go back and check those words which
he honestly believes to be descriptive of himself
(Column 2). Scores on this instrument
are reported on a standard scale with X̄ = 50,
σ = 10. Ordinary standard scores are obtained
for each of four vectors representing
a clustering of the words checked and indicating
the following behaviors:

V-1 Positive, approach behavior in an antagonistic
    situation, real or imaginary
V-2 Positive, approach behavior in a
    friendly situation, real or imaginary
V-3 Negative, withdrawal behavior in a
    friendly situation, real or imaginary
V-4 Negative, withdrawal behavior in an
    antagonistic situation, real or imaginary

This study was designed to determine the
extent to which AVA can measure the temperament
characteristics purported to distinguish
between male members of the managerial-supervisory
occupational level and those
of the routine operation worker level.

1 The principal contents of this article were presented
at the annual meeting of the Southeastern
Psychological Association, Atlanta, Georgia, April 28,
19¢¢.


Procedure 


Concurrent Sample 


The employees of a large industrial concern were 
divided into two occupational categories: higher and 
lower. All Ss included in this sample were males 
and had attained their individual occupational status 
prior to taking the AVA. No one was chosen as an 
S who was selected, transferred, or promoted to one 
of the occupations included in the a priori chosen 
categories as a result of the AVA. The higher level 
class consisted of executives, managers, and other 
management level supervisors. The lower level class 
consisted of mechanics, machinists, machine opera- 
tors, draftsmen, maintenance men, and laborers. No 
professional employees such as engineers and lawyers 
who were employed by this company were included 
in the sample. Only those occupations were studied 
in which progress from the worker to the managerial 
levels would be possible regardless of degree of formal
education or training of the incumbent. Also
those occupations which had been included in an
earlier study (mainly those in the sales-clerical and
general office fields) were excluded from the samples
drawn for this study.

The median age of the members of the higher
group was 37 with a range of 23 to 54. The median
age of the members of the lower group was 35 with
a range of 16 to 57. Hence, the two groups appear
to be quite well matched with respect to age.

The Ns for the samples of this study were 47 for
the higher category and 112 for the lower category.
An average resultant (Column 1 plus Column 2) pattern
based on the four vector scores was obtained
for the higher occupational category (hereafter referred
to as Group A) and the lower category (hereafter
referred to as Group B). These patterns are
shown in Fig. 1. Accompanying statistical data
for these average profiles are presented in Table 1.

Fig. 1. Average AVA resultant patterns for higher
and lower occupational groups.

Table 1
Distribution and Serial Correlation Statistics for AVA Resultant Vector Scores

             Higher           Lower            Both
            (N = 47)        (N = 112)        (N = 159)
Variate     X̄      σx      X̄      σx      X̄      σx      rs       t       P
V-1       +4.51   7.92   -0.46   6.96   +1.01   7.60   +.367   4.95   <.001
V-2       +5.32   7.22   -1.15   5.52   +0.76   6.75   +.526   7.76   <.001
V-3       -5.15   5.57   +1.98   7.07   -0.17   7.41   -.526   7.76   <.001
V-4       -5.45   5.32   -1.01   4.68   -2.32   5.28   -.467   6.62   <.001

Note. These statistics are provided for the interest of the reader only.
They were not directly employed in the analysis
of the separation between the two groups of this study since AVA interpretation
is made only on the basis of total pattern
integration.

[...] The constant 25 was added
to each deviation [...] tions by removing [...]
A Fisher Two-Group Discriminan [...]

Prediction Sample

A second sample (N [...]
from a population of m[...]
of the original sample, [...]
weights were applied to th[...]criminant
score was derived [...]
sample. Then employing [...]tion
as a basis for prediction, the class [...]
of each of the 76 Ss was determined, [...]cance
by the χ² test.

A further test was made of the [...]
determining occupational differenti[...]
A trained AVA Analyst 2 was [...] Using only
his understanding of AVA theory and his knowledge
of the location of 258 pattern shapes on a
global universe the prediction was made. He was
told how many of the Ss were in Group A and
Group B, but he was completely unaware of the
actual job status of the Ss he was asked to classify.
The predictions were made solely on the basis of
correlations between individual profiles and the two
reference patterns (Fig. 1). These data were obtained
from a table of correlation coefficients with
which he was provided. Comparisons between predicted
and actual class membership were then made
as in the previous step.

2 Person trained by Walter V. Clarke to apply
AVA theory in the administration and interpretation
of the instrument.


Results and Discussion

Concurrent Sample

Discriminant analysis applied to the problem
of distinguishing between the two groups
of the original sample on the basis of AVA
resultant patterns produced the discriminant
weights reported in Table 2.

Table 2 


Discriminant Weights for Vector Scores of 
Two Classes 


Mean Discrimi- 
Difference nant 
Variate (Class A~—Class B) Weig 
Aggressiveness 8 
(V-1) 4.96600 Om 
Sociability 
56 
(V-2) 6.47094 000 
Emotional Stability 6 
ool? ~7.13108 08 
ocial Adaptability 0 
(V-4) — 4.43788 —.0007 


Table 3

Analysis of Maximum Separation Between
Two Groups

Source of               Sum of       Mean
Variation       df      Squares      Square
Between          4      .004635      .001158
Within         154      .011834      .000076
Total          158      .016469      F = 15.2368


These data reveal that as a group the members
of the upper occupational class appear to
be more aggressive and socially confident,
whereas those of the lower stratum appear to be
more placid and submissive. This pattern
difference is consistent with AVA theory
which states that leadership qualities as indicated
by outgoing, self-initiating behavior are
necessary requisites of successful performance
in managerial-supervisory positions whereas
tendencies toward more relaxed and dependent
behavior are important at the routine-operational
level to insure high quality and
quantity of production.

The analysis of variance test applied to the
analysis of maximum separation between the
two groups yielded an F value of 15.2368.
These data are presented in Table 3.
For the 4 and 154 degrees of freedom upon
which this statistic is based, statistical significance
is indicated beyond the .001 level.
Hence there is ample evidence that the AVA
patterns differ between members of higher and
lower occupational classes in this industrial
population.
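As a check on the arithmetic of Table 3, the F ratio can be recomputed from the printed entries. The short sketch below uses only the figures in the table. The variable names are ours, and it is an inference, not something stated by the authors, that the reported F was formed from the mean squares as printed to six decimals.

```python
# Entries as printed in Table 3.
ss_between, df_between = 0.004635, 4
ss_within, df_within = 0.011834, 154

# F recomputed from the sums of squares (unrounded mean squares).
f_from_sums = (ss_between / df_between) / (ss_within / df_within)

# F recomputed from the mean squares as printed.
f_from_printed_ms = 0.001158 / 0.000076

print(round(f_from_sums, 2), round(f_from_printed_ms, 4))
```

The ratio of the printed mean squares reproduces the reported 15.2368, while the unrounded sums of squares give about 15.08. Either value lies far beyond the .001 level for 4 and 154 degrees of freedom.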

In developing a guide to use in the classification
of the members of the prediction sample
on the basis of AVA vector scores, Centour
scores were derived from the discriminant
score distribution for these samples.

In deriving the individual discriminant
scores on which the Centour scores are based,
the discriminant weights were applied to the
appropriate vector scores of the individual Ss.
Then, since differentiation is independent of
the scale used, the discriminant scores were
transformed to a new scale (1,000a + 50) in
order to remove the decimals and the negative
signs. It was found that the discriminant
scores with the greatest predictive value for
the highest occupational category were those
above 36. For the lower occupational group
the discriminant scores with the highest predictive
value were those below 36. The score
of 36 appeared to have little predictive value
for either group since the probabilities of correct
classification were nearly equal for both.
For Group A, the mean discriminant score
was 42; for Group B, the mean was 30.

Prediction Sample 

Discriminant scores were calculated for
each of the 76 Ss in this sample employing
the discriminant weights derived from the
data of the concurrent sample. Then using
the Centour score data, the Ss were classified
into one of the two occupational classes. In
this classification process, the discriminant
score of 36 was used as the Group A cutoff
and 35 was used for the Group B cutoff. The
score of 36 was used as the Group A cutoff
because there was a slightly greater chance
of correct classification in this category than
for Group B. The results of this classification
procedure are summarized in Table 4.

Out of a total sample of 76 Ss, 62 were
predicted correctly as to hierarchical membership.
The χ² value for the data of this fourfold
table is 27.34 and is significant beyond
the .001 level.

Table 4

Predicted vs. Actual Occupational Classifications
by Experimenters

                 Predicted
Actual        B       A      Total
  A           9      25        34
  B          37       5        42
Total        46      30        76

The results of the AVA Analyst proved to
be equally good in predicting the dichotomy.
Because he judged the shapes of 6 patterns to
be invalid the total sample was reduced to 70
for his predictions. The profiles of the 6 individuals
were nearly straight vertical lines so
that no vector emphasis was indicated. It is
interesting to note, however, that all were
members of the lower category and this finding
is consistent with that of an earlier study
in which an identical number of nonclassifiable
patterns was obtained and they also were
all from the lower group. The results of these
predictions are reported in Table 5.

Table 5

Predicted vs. Actual Occupational Classifications
by AVA Analyst

                 Predicted
Actual        B       A      Total
  A           8      22        30
  B          32       8        40
Total        40      30        70

The reason why the AVA Analyst did not
attempt to classify these 6 Ss is that in the
normal processing of AVA results such patterns
are not interpreted. However, referring
to Figure 1, there appears to be a sound basis
for the classification of these Ss into Category
B where the average profile is much less
extended as compared to the same for Category
A. Out of a total of 70 Ss, 54 were predicted
correctly as to class membership. The
χ² value for the data of this table is 17.79
and is significant beyond the .001 level.
Of 70 common predictions made
by the experimenter and the AVA Analyst, 66
proved to be identical.
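The two reported χ² values can be reproduced from the fourfold tables. The sketch below uses the standard formula for a 2 × 2 table with Yates' continuity correction. That the correction was applied is our inference from the agreement of the figures, and the function name is an assumption.

```python
def yates_chi2(a, b, c, d):
    """Chi squared for the 2 x 2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: actual A, actual B. Columns: predicted B, predicted A.
chi2_table4 = yates_chi2(9, 25, 37, 5)   # experimenters, N = 76
chi2_table5 = yates_chi2(8, 22, 32, 8)   # AVA Analyst, N = 70

hit_rate_table4 = (25 + 37) / 76
hit_rate_table5 = (22 + 32) / 70
```

Rounded to two decimals the results are 27.34 and 17.79, matching the values given for Tables 4 and 5. The corresponding proportions of correct predictions are about 82% and 77%.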


Summary and Conclusions

Two different experimental approaches were
studied with reference to the measurement of
differences in temperament possessed by male
members of higher and lower occupational
classes in an industrial population, and in
using this information to predict class membership.
One involved a rigorous statistical
analysis technique; the other, a rather simple
and unsophisticated procedure. Both
methods proved to be highly successful in
this respect and have attested to the power
of AVA in the hierarchical classification of
nonprofessional male employees on the basis
of temperament characteristics. The findings
of this study confirm the existence of differences
in temperament characteristics of personnel
of higher and lower echelons of employment
which were found in an earlier
study based on mixed-sex samples drawn from
a business population. These findings further
confirm the power and efficiency of AVA in
measuring these differences, and in permitting
the proper classification of personnel accordingly.
They also suggest temperament criteria
to be evaluated in personnel selection
and assignment, and when considering the
promotion of nonprofessional male employees
to supervisory and managerial levels.

The following conclusions are held to be
tenable from the data of this study:

1. Differences in temperament characteristics
exist between employees of higher and
lower echelons.

2. AVA can be efficiently used in the hierarchical
classification of male industrial employees.


Received February 17, 1958. 
Early Publication. 


References

1. Clarke, W. V. Activity vector analysis. Providence:
Walter V. Clarke Associates, Inc.

2. Clarke, W. V. Personality profiles of loan office
managers. J. Psychol., 1956, 41, 405-412.

3. Clarke, W. V. Personality profiles of self-made
company presidents. J. Psychol., 1956, 41,
413-418.

4. Clarke, W. V. The construction of an industrial
selection personality test. J. Psychol., 1956,
41, 379-394.

5. Clarke, W. V. The personality profiles of life insurance
agents. J. Psychol., 1956, 42, 295-302.

6. Fisher, R. A. The use of multiple measurements
in taxonomic problems. Ann. Eugen.,
1936, 7, 179-188.

7. Tiedeman, D. V., Bryan, J. G., & Rulon, P. J.
The utility of the airman classification battery
for assignment of airmen to eight Air Force
specialties. Cambridge: Educ. Res. Corp.,
1951, Appendix C.


Journal of Applied Psychology 


Vol. 42, No. 5


OCTOBER, 1958 


Guttman-Type Scales for Union and Management Attitudes Toward 
Each Other 1


Ross Stagner 


Wayne State University 


W. E. Chalmers and Milton Derber 


University of Illinois 


The importance of "the attitudes of the
parties toward each other" has long been recognized
in industrial relations. Many writers
have used the concept, with and without reference
to specific indices of attitude. However,
so far as we know, no one has done any
systematic work on the preparation of scales
for the quantitative measurement of this variable.
Such scales as have been published
(e.g., the Illini City studies, [4]) have been
put together and have not been
subjected to scaling analysis.2 Many of these
scales also include specific items referring to
local situations, hence were not useful in comparative
studies across establishments.

For the above reasons, it seemed desirable
to prepare generalized scales which could be
presented to management and union officers
for the purpose of obtaining a quantified
statement of attitude. The substantive context
within which the scales were to be used
is reported in other publications (see, e.g.,
3). The present report is therefore limited
to the technical characteristics of the scales
themselves.

1 We should like to express our appreciation to
Milton Edelman of Southern Illinois University and
to our graduate assistants, Herbert Schaeffer,
Robert Va Nooy, Robert Mitchell, Sheldon Luskin,
and ...n Tipton, who aided in various phases of
the collection and analysis of the data.

2 It should perhaps be noted that the items used
in the Illini City studies with rank-and-file employees
were analyzed by Williams (5) who found
that both the attitude-to-company and attitude-to-union
items formed acceptable scales. However,
these were not used with union or company officials.


293 


Selection of scale items. Since it was our
desire to prepare a generalized scale covering
the major elements which might be involved
in the "attitude of management toward the
union" or in the "attitude of union toward
management," we selected items from previous
scales, and from discussions in the literature,
which seemed likely to have such generalized
applicability. Thus, in the scale for
unions, items were included about management
policies, about foremen, middle and top
management, about managerial use of power,
and so on. Parallel questions were framed to
be presented to executives for attitude toward
the union.

Respondents. The scales presented here
are apparently unique also in another respect.
They are based upon pooled answers from
two respondents, instead of treating each respondent
as a separate member of the population.
This was an important component of
our research design. Data regarding each establishment
(with few exceptions) were based
on the pooled answers of the two top union
officials, those who bargained directly with
the company, and on the pooled answers of
the two top executives who handled union relations
for management. This procedure was
followed to increase reliability and to decrease
the role of subjective biases in the perception
of situations.3 Specifically, in the case of the
attitudinal material, we wanted a figure which
represented "the position of management,"
not a single person's attitude, and similarly
for the union.

3 For example, there were numerous disagreements
between our pairs of managerial respondents even on
"factual" issues such as seniority practice. The use
of two informants presumably gave us more "objective"
data than reliance on a single person, especially
since we reinterviewed on "factual" questions in case
of disagreement. No reinterviews were conducted on
attitudinal items.

The analysis reported here is based upon an
N of 41, meaning that interviews were conducted
in 41 establishments in three downstate
Illinois communities. Actually, 76 executives
were interviewed (in 6 companies,
only one executive could be found who was
actively involved in union relations). Similarly,
the total number of interviews was 81
for union officials; in all but one case, two
men were interviewed to obtain the union position.
Interviews were conducted in plant
or union offices by our field interviewers.
Each item with its response alternatives was
presented, and the interviewer marked the response
selected. Respondents had no opportunity
to consult with each other regarding
answers to these items.

The population of 41 establishments has
been described elsewhere (1). Suffice it to
say that the range of sizes (hourly paid employees)
was from 73 to 2,100 employees;
some were in manufacturing, others in utilities
and services; among the manufacturers,
both producer and consumer goods were involved;
and a wide variety of unions functioned
as bargaining agents. However, these
41 firms were restricted to cases which were
not branches of larger organizations, and to
cases in which a single union bargained for a
majority of the work force.

Since each item was presented as a four- or
five-alternative multiple-choice question, pool[...]
Perfect agreement between the two spokesmen was
not common. For managers, agreement was
found on five items in at least 50% of firms.
For unionists, seven items showed agreement
in at least 50% of the cases. Since 14 items
were used, it is clear that neither "the management
position" nor "the union position" is
clearly established and unequivocally stated
in the typical relationship. However, it should
be noted that differences in response were
usually limited to a single step on the answer
scale.

[...]pon the responses of two [...] implies
the assumption that the answer steps
represent equal intervals. This assumption
has not been tested. Since the final product
is presumed to be only an ordinal scale, the
inequality of answer steps would seem not to
introduce a serious question.


Procedures 


The task of this analysis, in generalized terms, is to determine the
functional unity of the specific attitude items, i.e., to test the
hypothesis that there is a single underlying hypothetical continuum
which correctly represents the position of one party's attitude with
respect to the other. This would be in contrast to an equally plausible
finding, viz., that each attitude is truly composed of a number of
independent components, and that the selection of an index number to
represent one party's position is merely a case of averaging positions
on specific issues not intrinsically related one to another. Research
on union-management relations has commonly assumed that these attitudes
are unitary; we felt it important to test this assumption.

Two methods are commonly used for establishing the unitary nature of a
function revealed in a series of discrete responses: correlational
analysis, and scaling analysis. For purposes of cross-comparisons, we
used both. Phi coefficients were computed between all pairs of items,
to permit determination of clusters of related items; and Guttman's
scalogram analysis (3) was also applied to the data. The following
items were used in the analysis (the items are ranked as in the final
scale; marginal numbers indicate position in original questionnaire;
numbers in parentheses, such as (2.5), indicate the cutting point
between an answer considered favorable and one considered unfavorable):

Scale for Management Attitude Toward Union

43. Are the union officers effective leaders of their organization?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor (2.5)



Scales for Union and Management Attitudes 


35. Is the union generally reasonable or not in its claims?
(1) very reasonable
(2) reasonable most of the time
(3) frequently unreasonable
(4) extremely unreasonable (2.5)

36. Does the union interfere seriously with how the company is managed,
or does the management have a reasonably free hand in running the plant?
(1) union is no problem
(2) it interferes a little but not seriously
(3) it interferes quite often
(4) it seriously interferes with management (2.0)

40. Are the union officers interested in the welfare of the
rank-and-file workers?
(1) very much so
(2) pretty much
(3) slightly
(4) very little (2.0)

38. Does the union cooperate with management on production matters or
not?
(1) they are extremely cooperative
(2) they will go along but not positively support
(3) they do not interfere seriously but sometimes are obstructionist
(4) they restrict production improvements quite often (2.5)

34. In general, how do you personally feel about your company's
relations with the union?
(1) very satisfied
(2) moderately satisfied
(3) moderately dissatisfied
(4) very dissatisfied (2.0)

39. Has the union tended to weaken employee discipline or has it
cooperated with management on disciplinary matters?
(1) cooperative and helpful
(2) sometimes helps but not always
(3) sometimes interferes with discipline
(4) has created some serious disciplinary problems (2.5)

46. Does the union have too much power in your establishment?
(1) not too much
(2) too much in a few respects
(3) too much in many respects
(4) far too much (1.5)

44. Does the union have the support of the workers?
(1) most of the workers are strongly behind it
(2) only a few really active people but most workers go along
(3) not too much feeling either way
(4) a lot of the workers are hostile (1.0)

37. How do you feel about using the union as the main channel of
communication to the workers on company policies?
(1) strongly favor
(2) moderately favor
(3) moderately oppose
(4) strongly oppose (combined with 41)

41. Are the local union officers skillful bargainers?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor (favorable if 37 + 41 = 3 or less)


The following items did not scale: 

42. Are the international union representatives skillful bargainers?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor
(5) none are involved

45. Do the international union officers create any serious problems or
not?
(1) they are generally responsible and helpful
(2) they are more helpful than troublesome
(3) they are more troublesome than helpful
(4) they generally stir up trouble
(5) none are involved

47. Does the union try to live up to its agreements?
(1) always
(2) usually
(3) frequently does not
(4) rarely


Scale for Union Attitude Toward 
Management 


43. Are the top management officials effective executives of the
establishment?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor (2.0)

44. What is the top management attitude toward the union?
(1) strongly favorable
(2) moderately favorable
(3) moderately unfavorable
(4) strongly unfavorable (2.0)

47. Does the company try to live up to its agreements?
(1) always
(2) usually
(3) frequently does not
(4) rarely (2.0)

46. Does the company abuse its power in this establishment?
(1) rarely
(2) [...]
(3) [...]
(4) very often (1.5)

34. In general, how do you personally feel about your union's relations
with the company?
(1) very satisfied
(2) moderately satisfied
(3) moderately dissatisfied
(4) very dissatisfied (2.0)

36. Has the management shown any understanding of your problems as a
union officer?
(1) very understanding
(2) understands the union situation pretty well
(3) understanding of union problems is limited
(4) little or no understanding of union problems (2.0)

37. Has the management tried to undermine the union position through
direct dealings with the workers, or has it been careful to safeguard
the union position in such contacts?
(1) is always careful not to hurt union
(2) is usually careful not to hurt union
(3) occasionally tries to weaken union
(4) frequently tries to weaken union (2.0)

35. Is the top management generally reasonable or not when it comes to
discussing union claims?
(1) very reasonable
(2) reasonable most of the time
(3) frequently unreasonable
(4) extremely unreasonable (1.5)

40. Are the top management officials interested in the welfare of the
workers?
(1) very much so
(2) pretty much
(3) slightly
(4) very little (1.5)

The following items did not scale:

38. Do the foremen, in general, act toward the union in the same way as
top management?
(1) the foremen are very much easier to get along with
(2) the foremen are somewhat easier to get along with
(3) the foremen are about the same as top management
(4) the foremen are somewhat more difficult to get along with
(5) the foremen are much more difficult to get along with

39. Does middle management, in general, act toward the union in the
same way as top management?
(1) the middle management are very much easier to get along with
(2) they are somewhat easier to get along with
(3) they are about the same as top management
(4) they are somewhat more difficult to get along with
(5) they are much more difficult to get along with

41. Are the top management officials skillful bargainers?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor

42. (If a multi-plant company with a [...] outside this establishment)
Are the company representatives from the home office skillful
bargainers?
(1) very much so
(2) pretty good
(3) mediocre
(4) very poor
(5) not involved

45. (If a multi-plant company with a [...] outside this establishment)
Do company representatives from the home office create serious problems
or not?
(1) they are generally responsible and helpful
(2) they are more helpful than troublesome
(3) they are more troublesome than helpful
(4) they generally stir up trouble
(5) not involved


Results

1. Attitude of management toward the union. The 14 X 14 matrix gives 91
phi coefficients, of which 26 are positive and significantly above
zero,⁴ while only 13 are negative, one of these being significant. This
finding would suggest the existence of a single common factor. However,
inspection of the data indicates that only 7 items are closely related;
15 of their 21 intercorrelations are positive and significant. Of the
remaining 70 coefficients, only 11 are positive and significant. Hence,
the common factor appears to be defined by such items as 34 and 46 (how
do you personally feel about company's relations with union, and does
union have too much power). Next are Items 38, 39, 40 and 47, followed
closely by Item [illegible].

⁴ Since the data were dichotomized to compute phi coefficients,
significance was tested by finding the 5% value of chi square for
df = 1. This value corresponds to phi of .293. Copies of tables of phi
coefficients may be obtained by writing to M. Derber, University of
Illinois, Urbana, Illinois.

The remaining seven items have few significant correlations either with
the previously listed seven or among themselves. A decision based
solely on this set of data would undoubtedly have been to take the
seven items with a majority of the significant positive coefficients
and use them as a scale, rejecting the remaining seven which show
random interrelationships.
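
The correlational step described above can be illustrated with a minimal sketch. It assumes each item has already been dichotomized to 1 (favorable) or 0 (unfavorable), and it uses the standard identity chi square = N times phi squared with the conventional 5% critical value for one degree of freedom. The function names and the example data are hypothetical and are not taken from the study.

```python
import math

CHI_SQUARE_5PCT_DF1 = 3.841  # conventional 5% critical value, df = 1


def phi_coefficient(x, y):
    """Phi for two equal-length lists of 0/1 responses (1 = favorable)."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        return 0.0  # an item with no variation carries no association
    return (a * d - b * c) / denom


def chi_square_from_phi(phi, n):
    """Chi square (df = 1) implied by a phi coefficient on n cases."""
    return n * phi ** 2


def is_significant(phi, n):
    """True when the implied chi square exceeds the 5% critical value."""
    return chi_square_from_phi(phi, n) > CHI_SQUARE_5PCT_DF1


def number_of_pairs(n_items):
    """Number of distinct item pairs, e.g. 91 for 14 items."""
    return n_items * (n_items - 1) // 2
```

For 14 items, `number_of_pairs(14)` gives the 91 coefficients mentioned in the text, and `number_of_pairs(7)` gives the 21 intercorrelations within a seven-item cluster.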

The second analysis was based on the Guttman procedure. An arbitrary
score was computed for each establishment by summing the values of the
answers to the 14 items; and the establishments were ranked in order of
favorableness toward the union. Responses to the items were then
tallied in 14 columns, with the item showing the largest number of
favorable opinions at the left, and decreasing toward the right.
Cutting scores were then set such that a maximum number of favorable
responses to a given item would be in the upper portion of the column,
and unfavorable responses concentrated in the lower portion.⁵ By
dropping items, recomputing scores, reranking establishments, etc., it
was possible to reduce the number of “errors”: unfavorable responses
above this arbitrary line, favorable responses below it.

In the course of this scaling, Items 42 and 45 were dropped first. Item
42 had only one significant positive phi coefficient, and 45 had the
one significant negative phi, with no significant positives. The
results of the two methods thus far are closely comparable. In the
final phase of scaling Item 47 was also dropped. On phi analysis 47 had
a large number of significant coefficients (six). We cannot explain why
it proved impossible to find a suitable cutting point on 47 which would
relate it suitably to the revised total score. Finally, Items 37 and 41
were kept in the scale by combining them; i.e., the choice numbers on
the two items had to add to 3 or less to be counted as favorable.
Otherwise the combined response was treated as unfavorable. Thus, the
Guttman scaling procedure gives us a scale using 10 items, six being in
the list of seven having high intercorrelations. Two items, 35 and 43,
were found to scale adequately despite a modest number of significant
correlations, and two others, 37 and 41, were saved by combining them
to make one scoring unit, a procedure which is feasible with Guttman
scaling but difficult to achieve by straight correlational methods.⁶

⁵ For precise details of the Guttman method, see (3).

Guttman determines whether he has a “true 
scale” by computing the “coefficient of repro- 
ducibility.” This is simply the number of re- 
sponses correctly located as favorable or un- 
favorable (see above) divided by the total 
number of responses. For the final management scale, this coefficient
is .915, indicating the presence of a “true scale” in this case.
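
As an illustration of the ratio just defined, the sketch below computes a coefficient of reproducibility from a table of dichotomized responses. It is an assumption-laden simplification: errors are counted as deviations from the ideal pattern implied by each establishment's total score, which is one common convention and not necessarily the tallying procedure the authors followed. The function name and data layout are hypothetical.

```python
def reproducibility(responses):
    """Coefficient of reproducibility for rows of 0/1 item responses.

    responses: list of rows, one per establishment, each a list of
    0/1 values (1 = favorable) in a fixed item order.
    """
    n_items = len(responses[0])
    # Order items from most often to least often answered favorably.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])

    errors = 0
    for row in responses:
        score = sum(row)
        for rank, j in enumerate(order):
            # In a perfect scale, a score of s means the s most
            # popular items are favorable and the rest are not.
            expected = 1 if rank < score else 0
            if row[j] != expected:
                errors += 1

    total_responses = len(responses) * n_items
    return 1 - errors / total_responses
```

A table in which every row fits the ideal pattern gives 1.0; each misplaced response lowers the coefficient by one part in the total number of responses.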

Each item in a Guttman scale is scored as 
having a weight of one point. Thus, there is 
a possible range of scores from zero to nine. 
These are the scores which have been used in 
the substantive analysis of our data. 
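
The one-point-per-item scoring can be sketched as follows. This is a hedged illustration only: it assumes that a pooled answer at or below the cutting point counts as favorable (the text does not state how a value exactly at the cutting point was handled), it uses only a subset of the management items, and the names `CUT_POINTS` and `scale_score` are hypothetical.

```python
# Illustrative subset of management items and their cutting points.
CUT_POINTS = {43: 2.5, 35: 2.5, 36: 2.0, 40: 2.0}


def scale_score(answers, cut_points=CUT_POINTS):
    """Count one point for each item answered favorably.

    answers: dict mapping item number to the pooled answer value.
    Assumption: answer <= cutting point is treated as favorable.
    """
    score = sum(1 for item, cut in cut_points.items()
                if answers[item] <= cut)
    # Contrived item: 37 and 41 count as one favorable unit only
    # when their answers add to 3 or less.
    if 37 in answers and 41 in answers:
        if answers[37] + answers[41] <= 3:
            score += 1
    return score
```

For example, `scale_score({43: 2, 35: 3, 36: 2, 40: 1, 37: 1, 41: 2})` counts items 43, 36, and 40 plus the combined unit, for a score of 4.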

2. Union scale. We may now consider the 
parallel scaling of the responses obtained from 
union respondents. It will be noted that the 
items presented to the union spokesmen were 
in general similar to those for management, 
but certain variations were necessary. 

The phi coefficient analysis for the 14 un- 
ion items indicates that the degree of internal 
coherence of attitudes here is somewhat less 
than for the company officials; only 20 co- 
efficients (out of 91) are positive and signifi- 
cant at the 5% level; again, one is negative 
and significant to this degree. 

The common thread in the union items 
seems best represented by Item 44 (what is 
top management attitude to union) and Item 
46 (does company abuse power), which have 
significant correlations with six other items. 
Close behind are 35, 36, and 40 (is top man- 
agement reasonable, has management shown 
understanding, top management interested in 
worker welfare), each of these having five sig- 
nificant correlations. Reasonably integrated 
with the others is Item 34 (how do you per- 
sonally feel about union relations with com- 
pany). One item, 38, had no significant co- 
efficients, and Items 41 and 42 had only one 
each. It would appear that a satisfactory scale could be prepared with
Items 44, 46, 35, 36, 40, and 34, since 14 of the 20 significant
positive coefficients are found within this group; only one of their
intercorrelations was below the 5% cutting point.

⁶ Guttman defends the procedure of contriving these combined items on
the ground that a pattern of responses may provide information not
given by an isolated response.

Turning now to the Guttman analysis, we 
find a situation similar to that in the manage- 
ment data. Four items were dropped at the 
first stage of the analysis: 38, 39, 42, and 45; 
collectively, these four items accounted for 
only three of the 20 significant positive coeffi- 
cients, and for the single negative coefficient 
which reached significance. It would appear, 
therefore, that items which correlate very 
poorly with others in the pool are fairly cer- 
tain to drop out early in a Guttman analysis. 
After rescoring and reranking the establish- 
ments, we found it necessary to drop one 
more item, 41; this item likewise has only a single significant
correlation. The remaining nine items were shown to form a “true
scale”; the coefficient of reproducibility was .902. It is interesting
to note that the item with the fewest “errors” in the final scaling is
44, which was also best in the correlational data; but, on the other
hand, Item 46, almost as good in the correlation matrix, has almost the
highest number of errors in the Guttman procedure. Since similar
anomalies were observable in the management data, we infer that a core
of items established by intercorrelations will agree fairly well with
those selected in the early stages of the Guttman method; but as
further precision is sought, the correlational results do not predict
well the final scaling of items.


Guttman considers [...] “ordinal” in character [...]ing of scores he
speaks [...]. However, he notes that it [...] data to treat everyone
[...] numerical score as belonging [...] not be exactly [...]. We have
followed this suggestion and thus have treated [...] m favorable
answers as belonging to scale type m. Since the number of errors is
small, the number of cases which would be reclassified by a different
procedure is very small.




Table 1 


Distribution of “Scale Types”* 


ion to 
aes PP gach 
b 
1 ï = 
3 1 
8 13 1 
7 11 : 
6 1 5 
5 3 5 
4 3 3 
3 0 I 
2 4 : 
1 2 3 
0 1 


a We have treated the scale type simply as [...] favorable responses,
and statistically have considered [...] the equivalent of a test score.

b The maximum for the union scale is 9.


The final distribution of scale types is shown in Table 1 for both
scales. It will be noted that the entire attitudinal range is
represented in our population of establishments on each scale; however,
there is not a uniform spread along the scale as Guttman would consider
desirable. The piling up is in the “more favorable” categories; this is
presumably fortunate in terms of labor peace.

An additional technical note is important. Ideally, in Guttman scaling,
the marginal entries should be widely and fairly uniformly spread;
i.e., there should be a range from an item answered favorably by almost
everyone to one which is answered unfavorably by most of the
population, and items well spaced between these two. Our data do not
meet this test; despite our choices of cutting-points, items also pile
up in the range with large numbers of favorable answers. This situation
could not be corrected without seriously increasing the number of
“errors.” The errors remaining in the final scales were randomly
distributed.

3. “Attitudinal climate.” In our Illini City study, the data indicated
that a common “attitudinal climate” could be identified in each
establishment. This conclusion was based primarily on the responses of
rank-and-file workers (who were also union members), indicating that
liking for the company was associated with liking the union, and vice
versa. In a plant where the workers liked management, they seemed also
favorable to the union; in firms where the management was disliked, the
union also was in relative disfavor.⁷ However, this relationship did
not hold when we compared the attitudes of union and management leaders
within the establishment.

In our present study, it is possible to make another test of the
“attitudinal climate” hypothesis. We have used reasonably parallel
questions, asked of managers and of unionists, to define the attitude
of each toward the other. If attitudinal climate generalizes to all
levels of an establishment, we would expect that firms in which
management approves of the union would also have a union which approves
of management; i.e., there should be a significant positive correlation
between the two measured attitudes.

This prediction is not confirmed. The correlation between the two
attitudes, for our 41 establishments, using the a priori scales based
on the original 14 items in each questionnaire, was .271. This verged
on significance, and it was expected that the refined scales would
correlate more highly. Just the reverse proved to be the case. The
final scales correlated [...] with each other, a result which would
seem to indicate that there is no connection, except perhaps in very
extreme cases, between the two attitudes. We are therefore disposed to
conclude that, while attitudinal climate may be a functional reality at
the level of the rank and file, the relationship at the level of the
leaders is too complex to be summed in a simple formulation of this
kind.

⁷ Form and Dansereau [...] that active union [...] their departments
(“as [...] place to work”) more than inactive members.

⁸ [...] indicates that there may [...] management [...] hostility to
the union, and similarly on the part of [...] these factors [...] to
give rise to “types” or patterns which are repeated several times [...]
using a cutting point [...] separately and treating as a single type
those establishments which had certain combinations of high or low
scores (see [1] for details).



Discussion 


This report deals with a portion of the data 
from an extensive quantitative study of un- 
ion-management relations at the local level 
(1). A Guttman analysis (supported by cor- 
relational data) indicates that unidimensional 
scales have been established both for manage- 
ment attitudes to the union and for union 
attitudes toward management. 

The scales developed are strongly character- 
ized by what may be called a “halo” factor, 
or a generalized tendency to approve or dis- 
approve. The better items in the manage- 
ment scale were such as, “How do you per- 
sonally feel about the company’s relations 
with the union?” “Does the union have too 
much power?” However, this does not mean 
the respondent approved (or disapproved) of 
everything associated with the union. Highly 
specific items failed to scale; e.g., “Are the in- 
ternational union representatives skillful bar- 
gainers?” The same tendency (for general- 
ized items to scale, and for specific items not 
to scale) occurs with the union attitudes. It 
would appear that a Guttman scale may be 
simply an elaborate and highly reliable way 
to ask, “Do you approve of the union?” 

Such scales can, of course, be of consider- 
able value in research which involves quantita- 
tive estimates of union or management atti- 
tudes. It is to be hoped that the scales here 
reported will be seen by other students of in- 
dustrial relations as valuable devices, to be 
incorporated in their own research designs. 
This seems to be a path by which progress in 
this field can be greatly accelerated. 


Summary 


1. Fourteen multiple-choice items, relating 
to the attitude of management toward the un- 
ion, were presented to the two top labor-rela- 
tions officials of 41 Illinois companies. Cor- 
relational analysis indicated the presence of 
a single factor, or unitary attitudinal dimen- 
sion, involving at least seven items. Guttman 
analysis confirmed this, and produced a “true 
scale” consisting of nine of the original items 
plus one contrived item combining two ques- 
tions. 

2. A parallel analysis of responses from the union officials in these
same establishments led to similar conclusions and to a Guttman scale
of nine of the original items.

3. Attitude of management toward union is 
uncorrelated with attitude of union toward 
management, when measured by these refined 
scales. 

4. The refined scales are valuable instru- 
ments for substantive research on union-man- 
agement relations at the plant level. 


Received January 17, 1958. 


References 


1. Derber, M., Chalmers, W. E., & Stagner, R. Uniformities and
differences in local union-management relationships. Indus. Labor Rel.
Rev., 1957, 11 (Oct.), 56-71.

2. Form, W. H., & Dansereau, H. K. Union member orientation and
patterns of social integration. Indus. Labor Rel. Rev., 1957, 11
(Oct.), 3-12.

3. Guttman, L. The basis for scalogram analysis. In: S. A. Stouffer et
al., Measurement and prediction. Princeton, N. J.: Princeton Univer.
Press, 1950. Pp. 60-90.

4. Institute of Labor and Industrial Relations, University of Illinois.
Labor-management relations in Illini City. Vol. 1, Case studies. Vol.
2, Exploration in comparative analysis. Champaign, Ill.: Author, 1953,
1954.

5. Williams, L. K. Scaling of interview data on two union-management
relationships. M.A. thesis, Univer. Illinois, 1954.


Journal of Applied Psychology
Vol. 42, No. [...]


The Attitudes of West Texas College Students Toward School 
Integration 


Herbert Greenberg and Dolores Hutto 


Texas Technological College 


Since the recent Supreme Court decision declaring segregation in
schools to be unconstitutional, interest is high in the question of
school integration. Much has been said about the [...] of integration
if effected too rapidly; and on the other side, the pressure has been
great to desegregate just as quickly as the physical change could be
effected. Few attempts have been made, however, to scientifically
determine the attitudes of the students themselves concerning
integration. Thus, the senior author and his associates conducted a
study (2) of Negro and white high school students in the West Texas
area. A logical next step it seemed was to conduct a similar study on
the college level. This seemed especially appropriate after the
incidents in Alabama and the contrary smooth integration at the
University of North Carolina.


Problem 


1. To determine the attitudes of students in a West Texas college
toward school integration in general and to specific areas of
integration

2. To determine whether there is a relationship between
authoritarianism and attitudes toward school integration

3. To determine whether classification in college is related to
authoritarianism or to integration attitudes

Procedures 


Population. A battery of tests was administered to [...] students of
Texas Technological College of Lubbock [...] represents approximately
[...] population and was selected on the basis of an exact breakdown as
to division, e.g., engineering, liberal arts, etc.; classification; and
sex. The tests were administered under the supervision of the senior
author by administrators to seven elementary psychology classes. The
personal data were then checked to determine the divisional,
classification, and sex breakdown obtained from the sample. Where the
test sample deviated from correct representation of the universe, i.e.,
student population of Texas Tech, cases were eliminated from the
over-represented groups till the tested population did accurately
represent the universe. This could successfully be done for sex and
divisional breakdown, but would have required an overly severe paring
in order to create a proportionately representative group from the
senior and graduate classes, as comparatively few of these students
could be found in elementary psychology. To make up this deficiency,
students were approached at random in the cafeteria and asked their
classification and division, and those students who fitted the need
were asked if they would cooperate by filling out a short
questionnaire. Agreement was received in every case, so that the
problem of selective sampling need not be considered. Through this
method, the exact categorical breakdown discussed above could be
achieved while allowing the bulk of the sampling to be derived from
group testing in the seven psychology classes.
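
The trimming of over-represented groups described above can be sketched as a small calculation. This is an illustration under stated assumptions, not the authors' procedure: the group labels, counts, and population shares are hypothetical, and the sketch covers only the elimination step, not the later recruiting of additional seniors and graduates.

```python
import math


def balanced_targets(sample_counts, population_shares):
    """Largest per-group counts that reproduce the population shares.

    sample_counts: dict of group -> number of cases tested.
    population_shares: dict of group -> share of the universe (sums to 1).
    """
    # The group that is scarcest relative to its share limits the
    # total size of a proportionately representative sample.
    max_total = min(sample_counts[g] / population_shares[g]
                    for g in population_shares
                    if population_shares[g] > 0)
    # Small epsilon guards against floating-point results like 19.999...
    return {g: math.floor(population_shares[g] * max_total + 1e-9)
            for g in population_shares}
```

With 80 cases in one group and 20 in another, and a universe split evenly between them, the sketch keeps 20 of each; the remaining 60 cases in the larger group are the ones that would be eliminated.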

The test battery. The California F Scale, the In- 
tegration Attitude Scale and a personal data form 
were employed. The California F Scale (1) was 
designed to measure authoritarianism in personality 
through a series of 29 statements to which the re- 
spondent reacts by strongly agreeing, agreeing, mildly 
agreeing, strongly disagreeing, and mildly disagreeing. 
An example item is “Familiarity breeds contempt.” 

The Integration Attitude Scale (2), “IA,” was de- 
signed to measure attitudes toward school integration 
in general and specific areas in integration in par- 
ticular. Thus, it is the purpose of this instrument 
not merely to obtain over-all integration attitudes, 
but to determine integration attitudes as applied to 
sports, classrooms, eating facilities, etc. For the 
purposes of this present study, question Number 24,
“I do not think I would be willing to sit next to a
member of another race in class,” was omitted and
was replaced by, “I would not accept a member of
the other race as a roommate in the dormitory.”

The personal data form was designed to secure 
personal-social data so as to permit the establishment 
of comparison groups between the classification, 
sexes, and divisions leaving integration attitudes and 
authoritarianism as variables. 
Herbert Greenberg and Dolores Hutto

Table 1
Product-Moment Correlation, IA to F

Classification       Product-Moment Correlation     N
Freshmen             0.215                          134
Sophomores           0.167                          124
Juniors              0.15                           107
Seniors              0.289                           68
Graduates            0.518                           23
Total Population     0.213                          456

Table 2
Means and Standard Deviations for the Several Classifications for Both
Scales

Classification   Mean F    SD F      Mean IA   SD IA    N
Freshmen         108.83    20.008    91.78     41.09    134
Sophomores       108.59    21.66     94.63     11.27    124
Juniors          113.26    21.64     93.63     37.82    107
Seniors          103.7     22.3      98.13     40.58     68
Graduates        105       17.44     97.39     42.5      23

Procedures in treating data. Means and standard deviations were found
for each classification, division, and sex for each scale. Critical
ratios were then obtained between the means of each of these breakdowns
for each scale. Thus, freshmen F-scale means were compared with those
of the sophomores, juniors, seniors, and graduates while the same
procedure was followed with the IA. Likewise, comparisons were made
between males and females for both scales and between the several
divisions.

Lastly, an item analysis of the IA Scale was con- 
ducted in which the number of positive and negative 
responses for each item was determined for each 
classification, division, sex, as well as for the total 
population. This was done to ascertain group at- 
titudes in specific areas of integration. 
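
The critical ratios mentioned in the procedures can be illustrated with a minimal sketch. The article does not state the formula used, so this assumes the usual large-sample form: the difference between two means divided by the standard error of that difference for independent groups. The function name and example values are hypothetical.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two group means in standard-error units.

    Assumes independent groups and the large-sample standard error
    sqrt(sd_1**2 / n_1 + sd_2**2 / n_2).
    """
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error
```

A positive value means the first group has the higher mean; the size of the value is then compared with the normal-curve value for the chosen significance level.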


Results 


Table 1 shows product-moment correlations 
for the classifications and for the total popula- 
tion IA to F. It can be seen that there is a 
very low correlation between the two scales, 
the only significant one being the +0.52 for 
the graduates. It should be noted further 
that the interscale correlation computed for 
the several divisions and for each sex yielded 
approximately the same results as for the total 


population. Thus, they are not presented here 
in separate tables. 
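
For readers who wish to reproduce interscale correlations of this kind, a minimal sketch of the product-moment coefficient follows. The function name and the example scores are hypothetical; the study's raw scores are not reported.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Here `x` would hold each student's F-scale score and `y` the same student's IA score, computed separately for each classification and for the total population.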

Tables 2 and 3 show means and standard deviations for each
classification and the critical ratios between designated comparison
classifications. It should be noted that a high score designates
greater authoritarianism and more negative attitudes toward
integration. Thus, the freshmen mean of 108.83 on the F scale makes
that group significantly more authoritarian than the seniors, having a
mean of 103.7. Conversely, on the IA scale the seniors, with a mean of
98.13, have significantly more negative attitudes toward school
integration than the freshmen, whose mean is the lower score of 91.78.
It should be noted further that, as in the case of Table 1, there was
no significant difference between the means of each sex and between the
several divisional means. Thus, they are not presented separately.

Table 3
Critical Ratios Between the Means of the Several Classifications for
Both Scales

Classification          CR “F”              CR “IA”
Freshmen/Sophomores     .29                 1.92 (5% level)
Sophomores/Juniors      1.66                1.87 (5% level)
Juniors/Seniors         2.97 (1% level)     3.41 (1% level)
Freshmen/Seniors        1.73 (5% level)     5.26 (0.1% level)
Freshmen/Juniors        1.72 (5% level)     1.62
Sophomores/Seniors      1.88 (5% level)     2.83 (1% level)

Attitudes Toward School Integration


In most areas the student attitude toward integration is quite
positive. For the most part, however, the more advanced classes appear
less positive than the freshmen and sophomores. It can be further noted
that in the areas of closer and closer personal-social contact the
relative positiveness of the answer decreases, until in such areas as
dating (Item 20, “I believe that dating between races will be a serious
problem soon after integration,” and Item 21, “I would not mind ‘double
dating’ with a couple both of whom were of the other race”) and dancing
(Item 12, “Different racial groups mixing at school functions [dances,
parties, etc.] will not be wise; it will only result in fights and ill
feeling between races”) a majority of the students expressed negative
attitudes.

In many items, on the other hand, students 
in all classifications expressed overwhelmingly 
positive attitudes. For example, in Item 1, 
“Tf another race was integrated into my school, 
I would do my best to accept them as class- 
mates and equals,” over 90 per cent of the 
respondents expressed agreement. Likewise, 
in questions pertaining to integration in class- 
rooms, athletics, and even fraternities, a large 
percentage of all classifications expressed favor- 
able integration attitudes. In 22 out of the 
29 “IA” items, a majority of all classifications 
expressed favorable integration attitudes, while 
in most of the remaining 7 items the several
classifications were divided in attitudes, with the
higher classifications, as indicated above, generally
having the less favorable attitudes.

An interesting result of this study also is 
that the pattern is almost identical to that 
shown by the high school group in our earlier 
study (2). The high school group have con- 
siderably higher authoritarian attitudes than 
the college group and somewhat higher scores 
on the IA scale, but the clear pattern of posi- 
tive attitudes with the decreasing positiveness 
in the close personal-social areas is present for
both groups.


Conclusions 


1. There is apparently little direct rela- 
tionship between authoritarianism and nega- 
tive integration attitudes for this population. 
The integration attitudes in the South are ap- 
parently so much a part of the culture that 
they are separated from the more generalized 
attitudes covered by authoritarianism. This 
hypothesis is lent additional weight by the 
fact that some correlation was found only in the
graduate group. This, of course, is the
group that through maturity and education 
may have gained insight into the relationship 
of the attitudes. 

2. In general, it may be concluded that 
there is a generally positive attitude toward 
integration on the part of the white college 
student in West Texas, thus easing the widely 
expressed fear of serious trouble when inte- 
gration comes. This conclusion is based on 
two facts. First, that the IA Scale means for 
all classifications lie well below a potential 
middle score of 116. This fact indicates that 
far more than 50% of the responses were 
those positive toward integration. This fact 
alone, however, would not be sufficient to 
justify this conclusion. The additional factor 
of the actual numbers of items to which an 
overwhelmingly positive response, as opposed 
to negative responses, was made adds sub-
stantial weight to the above statement. It
should be noted that in 22 out of the 29 IA 
items the positive integration attitudes out- 
number the negative ones in all five classifica- 
tions, while in most of the remaining 7 the 
classes are divided in their response. Further, 
in many items there is better than a two to 
one ratio of positive to negative response. 

3. Problems may be looked for in the areas 
of close personal-social contact such as socials, 
parties, dances and the like. 

4. There is apparently a discrepancy be- 
tween student and parental attitudes. Thus, 
where about 90 per cent of the students say 
they would do their best to accept members of 
the other race as classmates and equals and 


304 Herbert Greenberg and Dolores Hutto 


the majority would accept them into clubs
and even as good friends (Item 15), a definite
majority would not bring these classmates and
even good friends home with them due to fear
of parental disapproval (Item 8).

Received April 8, 1957.


References

1. Adorno, T. W., Frenkel-Brunswik, Else, Levinson,
D. J., & Sanford, R. N. The authoritarian
personality. New York: Harper, 1950.

2. Greenberg, H., Chase, A. L., & Cannon, T. M., Jr.
Attitudes of white and Negro high school stu-
dents in a west Texas town toward school in-
tegration. J. appl. Psychol., 1957, 41, 27-31.


Journal of Applied Psychology


Anxiety Level and Score on a Biographical Inventory ¹


Donald H. Kausler and E. Philip Trapp 


University of Arkansas 


Siegel (5, 6) has recently developed a bio- 
graphical inventory, The Biographical Inven- 
tory for Students (BIS), which shows promise 
as a predictor test, particularly in the area of 
academic achievement. He found, for exam-
ple, that four of the 10 subscales, Action,
Heterosexual Activities, Political Activities, 
and Dependence upon the Home, correlated 
significantly with college grade point average. 

The relationship between scores on the De- 
pendence upon the Home subscale (Dep) and 
grade point was curvilinear, with the higher 
grades corresponding to the extreme positions 
on the Dep continuum. In a follow-up study,
Kausler and Little (4), relating Dep scores 
with grades in psychology courses, obtained 
a similar curve. 

Siegel, in discussing the meaning of this 
curvilinear relationship, suggests that the Ss 
in the middle of the Dep range are more in- 
secure than the Ss located at the two ex- 
tremes. This difference in security accounts 
primarily for the difference reflected in aca- 
demic performance. 

The present authors prefer to interpret
Siegel’s findings in terms of a more clearly
defined motivational variable. The curvilinear
relationship suggests the operation of varying
levels of anxiety drive at varying levels of
dependency. In other words, the Dep sub-
scale may, in effect, be measuring a motiva-
tional source of individual differences.

This study is directed at investigating the
suggested relationship between the Dep sub-
scale and anxiety level. It is hypothesized
that: (a) the Ss scoring at the two extremes
on the Dep subscale of the BIS will differ
significantly in anxiety level from Ss scoring
in the intermediate range; and (b) the Ss
scoring at the two extremes on the Dep sub-
scale will reflect lower anxiety level than the
Ss scoring in the intermediate range. The
hypothesized direction of the differences be-
tween the two groups follows Farber’s sug-
gestion (1) that performance level is appar-
ently inversely related to anxiety drive level
on complex learning problems.

¹ The authors wish to thank Laurence Siegel and
Educational Testing Service for their permission to
use the BIS in this study.


Method 


Subjects. Since only the BIS form for men is 
currently available, the Ss were 64 male students, 
predominantly freshmen and sophomores, enrolled in 
general psychology at the University of Arkansas. 

Procedure. The entire BIS was administered with 
standard instructions to the Ss during a regular class 
period, though only the Dep subscale was used in 
this study. The Ss were informed that the objective 
of the inventory was to compare biographical infor- 
mation of students at the University of Arkansas 
with students in other universities. 

Approximately one month later, the Taylor Mani- 
fest Anxiety Scale (7) was administered to the Ss, 
again during a regular class period. The scale, titled 
Biographical Inventory, was described to the Ss as 
an additional source of biographical information for 
comparing students at Arkansas with students elsewhere.
The standard scoring procedures were employed.


Results


The Ss were divided into two groups on the 
basis of their scores on the Dep subscale. The 
scores ranged from -2 to +20, with a me-
dian of 9. The Ss in the first and fourth
quartiles formed the extreme group (N = 32),
and the Ss in the second and third quartiles
formed the intermediate group (N = 32).
The Ss were further classified according to 
their level of “manifest anxiety,” applying 
the method usually followed in studies with 
the Taylor scale. The upper 25%, based on 
the distribution of scores for all 64 Ss, con- 
stituted a high “manifest anxiety” group (HA, 
N = 16), the lower 25% a low “manifest anx-
iety” group (LA, N = 16), and the middle
50% a middle “manifest anxiety” group (MA, 
N = 32). 

Since both the Taylor scale and the BIS
are ordinal scales, the hypotheses investigated
were tested by a nonparametric test, χ². A
3 × 2 contingency table was established com-
prising two groups, Extreme Dep (high and
low scores on Dep) and Intermediate Dep
(intermediate scores on Dep), and three clas-
sifications, High Anxiety (HA), Middle Anx-
iety (MA), and Low Anxiety (LA). The re-
sults of this analysis are presented in Table 1
below.


Table 1

Dep Score and Anxiety Level: Observed and
Expected Frequencies

                 Extreme Dep           Intermediate Dep
Anxiety level    Observed  Expected    Observed  Expected
HA                   2         8          14         8
MA                  21        16          11        16
LA                   9         8           7         8

Note.—χ² = 12.36 (p < .01).

An examination of Table 1 reveals a large 
difference in “manifest anxiety” between the 
Extreme Dep and Intermediate Dep Groups 
and in the hypothesized direction. The Inter- 
mediate Group contains a preponderance of 
HA Ss, while the Extreme Group contains a 
high proportion of MA Ss. The overall χ² is
12.36, which is significant at the .01 level of
confidence.
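
The χ² computation reported here can be retraced with a short sketch. It assumes the observed frequencies of Table 1 are HA 2 and 14, MA 21 and 11, LA 9 and 7; the MA cells and the second LA cell are inferred from the stated group sizes (32 per Dep group; 16, 32, and 16 per anxiety group), so they are an assumption rather than values read directly from the table. The function name and the tabled critical value are illustrative choices.

```python
# Observed counts (rows: HA, MA, LA; columns: Extreme Dep, Intermediate Dep).
# The MA row and the LA/Intermediate cell are inferred from the marginal
# totals given in the text, not read directly from the printed table.
observed = [[2, 14], [21, 11], [9, 7]]


def chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected frequency under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_square(observed)
CRITICAL_01_DF2 = 9.210  # tabled chi-square value for df = 2 at the .01 level
print(stat, df, stat > CRITICAL_01_DF2)
```

Under these assumed counts the statistic comes to 12.375 with 2 degrees of freedom, close to the reported 12.36; the small difference may reflect rounding in the original computation.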


Discussion 


Successful performance on complex tasks,
such as college courses, is dependent upon
interactions among a multiplicity of aptitude
and motivational factors. The efficacy of bio-
graphical information in predicting perform-
ance on such tasks, therefore, depends upon
the extent to which this information measures
one or more of these factors. The biographi-
cal information derived from such scales as
the BIS is interpreted by the authors as tap-
ping primarily the motivational area. That
is, certain facets of the individual’s motiva-
tional system, depending, of course, on the
content of the biographical data, express
themselves through his personal history. The
present study substantiates this interpretation
in demonstrating a relationship between scores 
on the Dep subscale of the BIS and the mo- 
tivational variable of anxiety as measured by 
the Taylor scale. However, a more rigorous 
interpretation of this relationship in terms of 
drive level is restricted by recent criticisms 
(2, 3) of the Taylor scale as a measure of 
drive. 

The results of the present study also ap- 
pear consistent with common clinical evidence 
on manifest anxiety. An individual’s anxiety, 
ordinarily, should be more controlled or re- 
duced when competition between his depend- 
ency and independency needs is low rather 
than high. Individuals scoring at the extreme 
positions on the Dep subscale are likely to 
experience relatively low competition between 
their dependency and independency needs be- 
cause of marked subjugation or suppression of 
one need by the other. On the other hand, 
individuals scoring in the middle range, re- 
flecting a closer balance of strength between 
their dependency and independency needs, 
are more likely to experience greater inter- 
need conflict. The outcome would be evident 
in higher manifest anxiety. 


Summary 


The following two hypotheses were tested 
in the present study: (a) the Ss scoring at 
the two extremes on the Dep subscale of the 
BIS will differ significantly in anxiety level 
from Ss scoring in the intermediate range; 
and (b) Ss scoring in the intermediate range 
will reflect higher anxiety level. Anxiety level
was measured by the Taylor Manifest Anx-
iety Scale. Sixty-four male students at the
University of Arkansas served as Ss. In the
analysis of results, a 3 × 2 contingency table
was constructed comprising two groups, Ex-
treme Dep and Intermediate Dep, and three
classifications, High Anxiety, Middle Anxiety,
and Low Anxiety. The over-all χ² was 12.36,
significant at the .01 level of confidence, and
in the hypothesized direction. The findings
support the interpretation that the Dep sub-
scale of the BIS is measuring the motivational
variable of anxiety.


Received November 4, 1957.




References 


1. Farber, I. E. The role of motivation in verbal
learning and performance. Psychol. Bull.,
1955, 52, 311-327.

2. Hill, Winfred F. Comments on Taylor’s “Drive
theory and manifest anxiety.” Psychol. Bull.,
1957, 54, 490-493.

3. Jessor, R., & Hammond, K. R. Construct va-
lidity and the Taylor Anxiety Scale. Psychol.
Bull., 1957, 54, 161-170.

4. Kausler, D. H., & Little, N. D. The BIS depend-
ency scale and grades in psychology courses.
J. counsel. Psychol., 1957, 4, 223-224.

5. Siegel, L. A biographical inventory for students:
I. Construction and standardization of the in-
strument. J. appl. Psychol., 1956, 40, 5-10.

6. Siegel, L. A biographical inventory for students:
II. Validation of the instrument. J. appl.
Psychol., 1956, 40, 122-126.

7. Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290.


Journal of Applied Psychology
Vol. 42, No. 5, 1958


Development of a Criterion of Research Productivity ¹


Robert E. Stoltz 


Southern Methodist University 


The purpose of this study was to develop a 
criterion of research productivity for a physi- 
cal science research organization where the 
usual, seemingly more objective criteria were 
not present or not applicable (i.e., number of 
publications). 


Method 


Subjects. The subjects in this study were all tech-
nically trained physical science research workers,
holding at least a bachelor’s degree in one of the
physical science or engineering fields. The Ss were

Engineering, Reactor Metallurgy, Petroleum Chem-
istry, Ceramic Engineering, Nonferrous and Ferrous

sons, of whom over one third might be considered to
be technical research persons. The work of this in-
stitute is devoted largely to contract research; how-
ever, considerable time is given to pure,
projects.

Development of the scale. Each of 27 research
supervisors participated in a one and one-half hour
interview with the investigator. This group of super-
visors represented 57% of the supervisors at this
level and constituted a cross section of all divisions
of the institute. The interviews were conducted along
a modification of Flanagan’s Critical Incident tech-


¹ The investigator is indebted to the staff of the
Battelle Memorial Institute for their cooperation and
assistance in conducting this study.


chologists were each asked to examine the items and
to group them into whatever categories they felt the
items represented. The investigator and two of the
eight psychologists then reviewed the groupings that
had been made, and finally decided on 15 categories
into which the majority of the statements could be
classed. Fifteen statements, or items, were selected
from each category as most representative of the
category by the three judges. Twenty-five other
items, which could not be categorized with any agree-

ferred to again ormation regarding
the checklists may be found in an unpublished report
by Stoltz (2). The items were reviewed again to de-
termine whether or not they conformed to three re-
quirements: (a) They must be observable by the
persons who were to use the checklists; that is, ad-
ministrative supervisors; (b) The behaviors men-
tioned in the items must be general enough to apply

the present one. This was done so that in future
research the same items might be used on new popu-
lations and make possible meaningful comparisons;
(c) The general behavior mentioned in the item must

Forty research supervisors and assistant research
supervisors described 80 research personnel using this
checklist (entitled the Productive Behavior Check-
list). It should be pointed out that there is some
i tion of the results as it was found
15 of the original informants as
raters in this phase of the study. Each rater was
Wo specific individuals: the most
least productive man they knew.
Twenty of the raters were instructed to rate the pro-
ductive man first, the remaining 20 raters were asked
oductive man first. Each item was

ems differentially due to

serial position in the checklist would tend to be
randomly distributed.


The checklists were then collected and the ratings
for each item were converted into two indices. A
discrimination index (DI) was determined for each
item by determining the amount of overlap between
ratings for the productive and the nonproductive per-
sons. On a sample of 50 randomly selected items,
these DI’s correlated .91 with t tests of the signifi-
cance of the difference between the means of the two
groups. A preference index (PI) was determined for
each item by computing the mean rating given the
item for both the productive and the nonproductive
group. Two forced-choice type rating scales were
constructed to use these items and their preference
and discrimination indices. These scales were titled
Research Behavior Description I and II (RBD I and
RBD II). The items within each scale were com-
posed of tetrads of items from the checklist, with
each tetrad having two items with discrimination in-
dices approximately 20 points greater than the dis-
crimination indices of the remaining two, and with all
items in the tetrad having nearly identical preference
indices. RBD I contained 25 tetrads and RBD II
contained 26 tetrads.²

Administration of the scale. Twenty-six divisions
of the organization were selected to furnish raters for
the study. These divisions constitute the principal
units of organization of the institute. A division is
to a great extent a functionally independent unit. A
division comes into existence to handle a particular
type of research problem and remains a division so
long as that area of research remains important. If
research activities in the area become increased, or if
a particular aspect of the original problem becomes
important enough to demand its own organization,
the division enlarges or subdivides. Divisions range
in size from 12 to 15 technical persons to over 100
technical persons. The division titles reflect the ac-
tivity of the division. For example, some of the
divisions covered in this study were Industrial Eco-
nomics, Physical Chemistry, Ceramics, Nonferrous
Physical Metallurgy, Light Metals, and Physical
Metallurgy. Each division was sent six copies of
RBD I and six copies of RBD II. Each division was
also given a list of three men who were to be rated.
These men were selected by the members of the
personnel department on a random basis. The divi-
sion supervisor was asked to select two supervisors
to rate each man using both RBD I and RBD II.
Following each rating the supervisor doing the rating
was to indicate on each scale his over-all evaluation
of the man being rated on a five-point scale, where
five was the highest rating possible.

At the end of a three-week period, 139 RBD I and
136 RBD II forms had been returned out of the 156
copies of each form that had been distributed. It
had previously been decided to use for analysis only
those forms returned by this deadline. A score for
each person rated was determined for each form by
counting the number of times a rater had selected a
higher discriminating item from each tetrad. An
average RBD I rating was determined for each per-
son rated by averaging the RBD I scores from his
two raters. An average RBD II rating was deter-
mined for each person in the same manner. An over-
all evaluation was determined for each person rated
by averaging the rating given him on the five-point
scale by each of the two raters. Since in some cases
there was only data from one rater the Ns in the
accompanying table will not be identical. In addi-
tion, the over-all ratings were quite negatively skewed
and were consequently converted to normalized T
scores. Intercorrelations between the possible com-
binations were computed and are shown in Table 1.

² Copies of the checklists or RBD I and II may be
obtained from the author.


Table 1

Research Behavior Description Reliabilities
and Validities

Coefficient              Scale                          N     r     p
Interrater Reliability   RBD I                          62   .31   >.05
                         RBD II                         59   .62   >.01
                         RBD I avg. vs. RBD II avg.     56   .74   >.01
                         Five-Point Rating Criterion    65   .70   >.01
Validity                 RBD I                         139   .48   >.01
                         RBD II                        136   .60   >.01
                         RBD I Average                  62   .55   >.01
                         RBD II Average                 59   .74   >.01

Note.—RBD averages were correlated with person’s average
rating on five-point criterion scale to determine validity.
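
The scoring rule described in the Method (count the higher-discriminating items a rater selected, then average over the raters who returned the form) might be sketched as follows. The tetrad labels, item labels, and the answer key are invented for illustration, and the sketch assumes a rater's selections are recorded as a list of chosen items per tetrad; the article does not say how many items a rater marked in each tetrad.

```python
from statistics import mean

# Hypothetical key: for each tetrad, the two items carrying the higher
# discrimination index. Labels are invented; the real RBD items are not
# reproduced in the article.
HIGH_DI = {
    "t1": {"a", "b"},
    "t2": {"c", "d"},
    "t3": {"a", "d"},
}


def form_score(selections, high_di=HIGH_DI):
    """Count how often a rater selected a higher-discriminating item."""
    return sum(len(set(chosen) & high_di[tetrad])
               for tetrad, chosen in selections.items())


def average_rating(rater_selections):
    """Average the form scores over the raters who returned the form."""
    return mean(form_score(s) for s in rater_selections)


# Two hypothetical raters describing the same man on one form.
rater_1 = {"t1": ["a"], "t2": ["a"], "t3": ["d"]}
rater_2 = {"t1": ["b"], "t2": ["c"], "t3": ["a"]}

print(form_score(rater_1))                 # 2
print(form_score(rater_2))                 # 3
print(average_rating([rater_1, rater_2]))  # 2.5
```

Averaging over a list of returned forms also covers the case noted in the text where only one rater's data was available: the average is then simply that rater's score.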


Results

Interrater reliability for RBD I was found
to be .31. Interrater reliability for RBD II
was found to be .62 and the reliability of the
over-all rating on the five-point scale, after nor-
malizing, was found to be .70. In view of the
widely different reliability coefficients for the
two forms an internal consistency item analy- 
sis was conducted to determine to what extent 
tetrad content influenced the coefficients. Al- 
though the tetrads had been randomly as- 
signed to the two forms it was found that 
when biserial correlations were computed be- 
tween tetrad scores and total scale scores, nine 
tetrads on RBD I and six tetrads on RBD II 
were found to be insignificantly related to 
total scale score. Moreover, four of the nine 
insignificant tetrads on RBD I had negative 
signs, while only one of the six insignificant 
tetrads on RBD II had negative signs. It
would appear that the low reliability of RBD
I is partially accounted for by its containing
more poorly constructed tetrads. This view
is supported by the fact that when the biserial
correlations were computed between the tet-
rads on RBD I and the over-all five-point
rating, essentially the same nine items proved
to be insignificantly related to the over-all
rating.

Evidence of the reliability of the
scale was gained by determining the product 
moment correlation between the average RBD 
I and average RBD II ratings on each person 
for whom there are four ratings. This corre- 
lation was found to be .74. 
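
For readers unfamiliar with the product moment correlation used for these reliability estimates, a minimal sketch follows. The two lists of average ratings are invented for illustration only; the study's raw ratings are not given in the article, so no attempt is made to reproduce the reported .74.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical average RBD I and RBD II ratings for five persons.
rbd1_avg = [10.0, 12.5, 15.0, 9.5, 18.0]
rbd2_avg = [11.0, 13.0, 17.5, 10.0, 16.5]

r = pearson_r(rbd1_avg, rbd2_avg)
print(round(r, 2))
```

Only persons with ratings on both forms from both raters would enter such a computation, which is why the N for this coefficient is smaller than for the single-form coefficients.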

The validity of the scales was assessed in 
several ways. First, each person’s rating was 
correlated with his over-all rating on each 
form. RBD I was found to have a validity 
coefficient of .48 and RBD II to have a coef- 
ficient of .60. When the person’s average rat- 
ing on each form was correlated with his aver- 
age over-all rating on the five-point scale, 
RBD I showed a coefficient of .55 while RBD 
II showed a correlation of .74. 


Discussion 


RBD II showed quite satisfactory inter- 
rater reliability and validity. RBD I did not 
show adequate interrater reliability, but this 
was no doubt due to the fact that this form 
contained a much greater number of insig- 
nificant, no doubt poorly constructed, tetrads 
than did RBD II. 

Validity coefficients based on the average
ratings of two raters showed increased validity
over either scale alone, as one might expect.
The correlation between the average ratings 
on the two scales indicates that even with the 
unreliability of RBD I, the two scales are 
measuring much the same thing. 

It appears that it is possible to develop a 
forced-choice rating scale to determine the 
extent of a person’s productive research be- 
havior, at least in so far as it is evaluated by 
his superior. The extent to which ratings on 
the RBD scales correlate with other measures 
of research productivity is not yet known. 
Neither is it known to what extent subordi- 
nates’ evaluation of productivity might agree 
with superiors’ evaluations. All of these prob-
lems are yet to be answered by future ap-
plications of the scales and their derivatives.
Perhaps the most interesting fact of the 
entire study is that the rating scales predicted 
as well as they did when one considers the 
heterogeneous character of the divisions used 
in the study. It would appear to indicate that 
if the rating forms are evaluating productive 
behavior, the same type of behavior is given 
merit in one division as in another even 
though the activities of the divisions may be 
quite different. There appear to be two pos-
sible, perhaps related, explanations. One,
that a person’s productivity is influenced most 
by characteristic work habits or attitudes that 
the person shows in all challenging activities 
and is somewhat independent of job knowl- 
edge. Or two, that supervisors are all re- 
sponding to a stereotype of the productive per- 
son that they reward in each division and 
which may have nothing to do with over-all 
productivity in a more objective sense. The 
present study does not indicate which of these 
two interpretations is correct. One would 
prefer the former explanation if one’s concern
is in developing and understanding the pro-
ductive process; but if our concern is predict-
ing the ability of a person to rise in company
esteem and status, the latter is of just as much
benefit.


Summary 


A forced-choice rating scale designed to de- 
termine the extent of a person’s productive 
research behavior was developed at a Mid- 
western physical science foundation. Of the 
two experimental scales developed the better 
form showed an interrater reliability coef-
ficient of .62 and a validity of .60. When the
ratings of two raters were averaged the validity
of the scale increased to .74.


Received December 16, 1957. 


References

1. Flanagan, J. C. The critical incident technique.
Psychol. Bull., 1954, 51, 327-358.

2. Stoltz, R. E. A study of productivity in a re-
search setting. Unpublished doctoral disserta-
tion, Ohio State Univer., 1956.


Journal of Applied Psychology
Vol. 42, No. 5, 1958


Performance as a Function of Control-Display Relations, Positions
of the Operator, and Locations of the Control ¹


Michael Humphries ²


University of Toronto, Canada 


The ease with which a particular combina-
tion of a control and a display can be learned
and operated is frequently determined by the 
nature of the control-display movement rela- 
tion employed. Operators appear to expect, 
on the basis of prior experience, that certain 
control movements should produce certain 
specific display changes (15). If these move- 
ment relationships are not present, perform- 
ance is frequently impaired. These effects 
have been demonstrated on the Two-Hand
Coordinator (4, 17, 23, 25, 29, 32), the Turret
Pursuit Apparatus (17, 21, 22, 33), the Mash- 
burn Apparatus (1, 2, 10, 17, 18, 19, 20, 26), 
the Toronto Complex Coordinator (15, 31), 
and for a variety of levers (6, 11, 13, 19, 24), 
rudder pedals (13, 14, 24) and knobs (7, 8, 
12). The basic conclusion from these experi- 
ments is that the movement of the control 
should be in the same plane and in the same 
direction as the resulting display movement 
in order to obtain the highest performance 
level and the fastest learning (5, 16). 
Although the directions and planes of move- 
ment of the two elements should be consistent 
in most cases, there are a few situations in 
which this generalization does not apply. In 
systems employing cranks or knobs it is dif- 
ficult to define consistency since rotary move- 
ment of a crank or knob is converted into 
linear movement on the display. This is the 
case on most two-hand coordinators. Also,
in some simple tracking tasks using levers
and where the operator does not work under
stress conditions, the principle of consistency
does not apply; controls can be connected to
the display so that the movement of the con-
trol is inconsistent with display movement
without impairing performance level (9, 27,
28, 34). In view of these limitations the re-
mainder of the present work should be con-
sidered in terms of the situations where only
relatively complex tasks are used and where
linear control movement results in linear dis-
play change.

¹ ... at the University of Toronto ... 9401-02 from the
Defence Research Board of Canada.
This paper represents part of a dissertation sub-
mitted to the University of Toronto in 1957 in partial
fulfillment of the requirements for the Ph.D. degree.
The writer wishes to acknowledge the assistance of
A. H. Shephard of the University of Toronto.

² Now with the Defence Research Medical Labora-
tories, Toronto, Canada.

It has also been demonstrated that the posi- 
tion of the operator relative to the display is 
an important factor in determining the level 
of performance. In an experiment soon to be 
reported, the operator was seated to one side 
of the display so that his frontal plane was 
at right angles to the frontal plane of the 
display. Under these conditions, with the 
control in front of him the operator had to 
turn his head slightly to the left in order to 
see the display. It was found that this body 
position greatly improved the level of per- 
formance obtained from a control-display rela- 
tionship which normally leads to an extremely 
low level of performance when the S is seated 
directly in front of the display (15). This 
suggested that there might be an interaction 
between the effects of body orientation and 
control-display relations. Since there was 
neither a theory nor sufficient empirical evi- 
dence to suggest what the hypothesized inter- 
action might be, a model was constructed for 
this purpose. 

The model was designed to predict the ef- 
fect on performance of variations in control 
position, body orientation and control-display 
relations for the Toronto Complex Coordi- 
nator (31). Since the model was not com- 
pletely successful and as it is not possible to 
describe it in a brief paper, only the empirical 
findings will be reported at this time. 


Method 


Apparatus 


The apparatus used was the Toronto Complex
Coordinator (TCC). This unit has been described
in the literature (15, 31) and only a brief
description will be given in this work.
The S is seated before a vertical panel of 81 light
assemblies, each consisting of an inner green disc and
an outer red ring. These lights are covered by an
opaque glass screen. Only one green disc and one
red ring can be seen at a time. Position of the ring
is determined by a stepping switch; that of the green 
disc by the position of the S’s control. The S is re- 
quired to move an airplane-type control stick so 
that the green disc within the ring is illuminated. 
When this has been accomplished, a new ring, or 
target, appears elsewhere on the screen. The task 
in this case is discrete, self-paced tracking. 

The measure of performance used was the number 
of matches made in a five-minute practice period. 
A match is recorded each time the S places the green 
disc within the illuminated target. 


Experimental Variables 


Three classes of independent variables were treated 
in the present work: control position, body orienta- 
tion, and control-display relationships. These three 
factors are discussed below. 

1. Control Positions: The majority of research on 
the TCC has been done under conditions where the 
control stick is perpendicular to the floor. This
means that there can be no consistency between con- 
trol movement and vertical changes on the display; 
the operator can move the control in the horizontal 
and normal directions, but not in the vertical. For 
example, the “expected” or “standard” control-display 
relationship has required the operator to push the 
stick away from himself to make the green light 
move up the display (15, 31). 

In a number of experiments (5) it has been demon- 
strated that the planes of movement of the control 
and the display should be the same to produce the 
highest level of performance. Consequently it was 
thought that the best level of performance would be 
obtained on the TCC if the control-unit were ar- 
ranged so that both the vertical and the horizontal 
movement relations of the control and the display 
were the same. For this reason the control unit was
mounted on a stand so that the control stick pro- 
jected towards the S, parallel to the floor and per- 
pendicular to the frontal plane of the display. In 
this way the other variables to be discussed can be 
compared under two positions of the control, one of 
which permits partial consistency in the planes of 
movement, and the other which permits complete 
consistency. 

2. Body Orientation: As mentioned previously, 
body reorientation made a usually difficult task very 
much easier than it was when the operator sat directly
in front of the display. Since the orientation
of the S could differentially affect the various
control-display relationships to be discussed later, it 
was decided that this factor should be systematically 
varied. The three body positions tested are illustrated
in Fig. 1. These positions have been designated
according to the clock-face numerals, 3, 6,
and 9.

3. Control-Display Relationships: The nature of 
the electrical connection between the control and the 
display of the TCC makes it possible to arrange a 
large number of control-display relationships. Any 
given arrangement, or task, can be defined by specify- 
ing the relationship between control movement and 
display change, where these directions are defined in 
terms of the display itself, regardless of the position 
of the control unit or of the operator. Although it
is possible to arrange a number of tasks, only four
are to be used in the present work. To be consistent
with previously employed terminology, these four
tasks have been defined in Table 1. In the column
of spatial relationships, the first letter of each pair
refers to the direction of control movement relative
to the display, while the second in each pair refers
to the direction of movement of the green light on
the display. The letters A, T, R, L, U, and D refer
to the directions away, towards, right, left, up, and
down, respectively. For example, in Task A, when
the stick is moved away from the display (A) the
green light moves down the panel (D).


Fig. 1. Plan view showing the position of the control,
display, and three positions of the operator.


Table 1


Control-Display Relationships for Four Tasks on the
TCC with the Control-Stick Mounted
in the Vertical Position


Task    Spatial Relations
A       A-D    T-U    R-R    L-L
B       A-U    T-D    R-L    L-R
C       A-R    T-L    R-U    L-D
D       A-L    T-R    R-D    L-U




Table 2


Control-Display Relationships for Four Tasks on the
TCC with the Control-Stick Mounted in
the Horizontal Position


Task    Spatial Relations
A’      D-D    U-U    R-R    L-L
B’      D-U    U-D    R-L    L-R
C’      D-R    U-L    R-U    L-D
D’      D-L    U-R    R-D    L-U


The above definitions of the four tasks do not 
apply when the control-unit is mounted on a stand 
due to the difference in vertical movement relations. 
In recognition of this fact, the four tasks with the 
control-unit on the stand are defined in Table 2. 
The meaning of the letters is the same as specified 
above.
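
The task definitions in Tables 1 and 2 amount to lookup tables from a control movement to a display movement. A minimal Python sketch of that encoding is given here for illustration only; the names `TASKS` and `display_movement` are assumptions introduced for this example and do not come from the apparatus description.

```python
# Control-display relationships as read from Tables 1 and 2.
# Keys are control movements, values are display movements:
# A = away, T = towards, R = right, L = left, U = up, D = down.
TASKS = {
    # Control stick vertical (Table 1)
    "A":  {"A": "D", "T": "U", "R": "R", "L": "L"},
    "B":  {"A": "U", "T": "D", "R": "L", "L": "R"},
    "C":  {"A": "R", "T": "L", "R": "U", "L": "D"},
    "D":  {"A": "L", "T": "R", "R": "D", "L": "U"},
    # Control stick horizontal, unit mounted on the stand (Table 2)
    "A'": {"D": "D", "U": "U", "R": "R", "L": "L"},
    "B'": {"D": "U", "U": "D", "R": "L", "L": "R"},
    "C'": {"D": "R", "U": "L", "R": "U", "L": "D"},
    "D'": {"D": "L", "U": "R", "R": "D", "L": "U"},
}


def display_movement(task: str, control_move: str) -> str:
    """Return the display movement produced by a control movement."""
    return TASKS[task][control_move]


# Example from the text: in Task A, moving the stick away (A)
# moves the green light down (D).
example = display_movement("A", "A")
```

Each task maps its four control movements onto four distinct display movements, so any entry can be inverted to ask which control movement produces a desired display change.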


Experimental Design 


The experimental design was a 4 × 3 × 2 factorial
(four tasks, three body positions, and two control
positions) with 24 independent cells, each cell consisting
of 10 independent observations. This design
permitted 23 orthogonal comparisons to be made.
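
The cell count and degrees of freedom implied by this design can be checked with a short sketch. The labels used for the factor levels (`"floor"` and `"stand"` in particular) are illustrative assumptions; only the numbers of levels and the 10 observations per cell are taken from the text.

```python
from itertools import product

# Factor levels; only their counts matter for the bookkeeping below.
tasks = ["A", "B", "C", "D"]
body_positions = [3, 6, 9]
control_positions = ["floor", "stand"]
n_per_cell = 10

# Every combination of task, body position, and control position is one cell.
cells = list(product(tasks, body_positions, control_positions))

n_cells = len(cells)                    # 4 * 3 * 2
n_total = n_cells * n_per_cell          # total observations
df_between = n_cells - 1                # single-df orthogonal comparisons
df_error = n_cells * (n_per_cell - 1)   # within-cell degrees of freedom
df_total = n_total - 1

print(n_cells, n_total, df_between, df_error, df_total)
```

The between-cell, error, and total degrees of freedom computed this way agree with the 23 comparisons mentioned above and with the Error and Total rows of Table 3.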


Procedure 


Each S was given a standard set of verbal instructions.
The control-display relation was demonstrated
and the S was shown how to make a match.
All Ss were instructed to make as many matches as
possible. Each S worked continuously for five minutes,
on only one of the experimental conditions.


Results 

The mean number of matches made by each
group is shown in Fig. 2. This figure shows
how performance on each task varies with the
position of the operator.

From these curves a number of trends are
apparent:

1. In Position 6: the levels of performance
on Tasks C, D, C’, and D’ are similar. The
levels of performance resulting from practice
on Tasks B and B’ are roughly two-thirds of the
distance between the levels found for Tasks
A and A’ on the one hand, and the levels for
C, D, C’, and D’ on the other. Tasks B and
B’ produce the same levels of performance; so
do Tasks A and A’. It is concluded, therefore,
that the positions of the control used in this
study do not influence performance on these
tasks when the operator is seated in Position 6.


2. In Positions 3 and 9: the levels of per- 
formance on Tasks A and B are lower in Posi- 
tions 3 and 9 than they are in Position 6, 
although the differences between the levels 
remain the same. It is concluded that the 
best performance on these tasks occurs when 
the operator sits facing the display. This is 
not the case for Tasks C and D, however. 
The performance levels for these tasks are in- 
creased by rotation of the operator, but the 
amount of improvement in each case depends 
on the direction of rotation. In Position 9, 
Task C appears to be easier than D, while in 
Position 3 the opposite is true. 

3. There is no apparent effect on perform- 
ance produced by rotation of the operator 
when he works on Tasks A’, B’, C’, or D’,
since these tasks have essentially the same 
level of performance in Positions 3 and 9 that 
they have in Position 6. 

Statistical analysis of the data illustrated in 
Fig. 2 supports the above generalizations. A 
summary of the analysis of variance is pre- 
sented in Table 3. 


Fig. 2. Performance as a function of control-display
relations, positions of the operator and locations
of the control. (Ordinate: mean matches; abscissa:
body positions.)




Table 3


Analysis of Variance Table of the Effects of Four Tasks, Three Positions of the
Operator and Two Positions of the Control


Source                                    df     Mean Square          F
Body Position (D)
  DL (Linear)                              1          4.2250          —
  DQ (Quadratic)                           1         39.6750          —
Stand vs. Floor (P)                        1         47.7042          —
Tasks (T)
  T1 ((A+A’)-(B+B’))                       1     13,996.8000      34.53**
  T2 ((C+C’)-(D+D’))                       1         91.8750          —
  T3 ((A+A’)+(B+B’))-((C+C’)+(D+D’))       1    148,553.5042     366.49**
DL × P                                     1        354.0251          —
DL × T1                                    1         12.0125          —
DL × T2                                    1      2,387.1125       5.89*
DL × T3                                    1        518.4000          —
DQ × P                                     1         69.0083          —
DQ × T1                                    1        397.8375          —
DQ × T2                                    1         57.0375          —
DQ × T3                                    1     73,112.0333     180.37**
P × T1                                     1         22.5333          —
P × T2                                     1        185.0083          —
P × T3                                     1    143,521.5042     354.07**
DL × P × T1                                1        255.6125          —
DL × P × T2                                1      1,320.3125       3.26
DL × P × T3                                1      3,240.0000       7.99**
DQ × P × T1                                1         87.6042          —
DQ × P × T2                                1        250.1042          —
DQ × P × T3                                1     66,646.5333     164.42**
Error                                    216        405.3458
Total                                    239

* Significant at the 5% level.
** Significant at the 1% level.
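
Each F in Table 3 is the mean square for a single-degree-of-freedom comparison divided by the error mean square. The sketch below recomputes the tabled F ratios from the mean squares as a consistency check; the dictionary keys are shorthand labels chosen for this example, not notation from the original table.

```python
# Error mean square (216 df) from Table 3.
ERROR_MS = 405.3458

# Mean squares of the comparisons for which an F value is reported.
mean_squares = {
    "T1": 13996.8000,
    "T3": 148553.5042,
    "DL x T2": 2387.1125,
    "DQ x T3": 73112.0333,
    "P x T3": 143521.5042,
    "DL x P x T2": 1320.3125,
    "DL x P x T3": 3240.0000,
    "DQ x P x T3": 66646.5333,
}

# F ratio for a 1-df comparison: comparison mean square / error mean square.
f_ratios = {name: ms / ERROR_MS for name, ms in mean_squares.items()}

for name, f in f_ratios.items():
    print(f"{name:12s} F(1, 216) = {f:.2f}")
```

Run as written, the ratios agree with the tabled F values to two decimal places.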


Discussion 


The results of the present experiment in- 
dicate that the difficulty of a perceptual-motor 
task on a given unit of apparatus is, in part, a 
function of the control-display relationship, 
the position of the control, and the position of 
the S with respect to the display. The effects 
on performance of various combinations of 
some values of these three classes of variables
indicate that these factors interact with one 
another. It was observed that some tasks are 
made more difficult while others are made 
easier by rotation of the S away from a position
directly in front of the display. This indicates
that knowledge of the control-display
relationship alone is not sufficient to predict 
the relative levels of difficulty. 

Also, it is not possible to predict the effects 
of body rotation on the various tasks without 
considering the position of the control. 

These observations have serious implications
for equipment design. Under conditions
where the S regards the display over his
shoulder, and perhaps also at other angles of
regard, the best level of performance and the
greatest facility in learning can be achieved
by using somewhat unconventional control-display
relationships. This conclusion probably
holds only for certain types and positions
of controls and displays. Consequently, until
others have been tested the present results
should not be used as the basis for general
design principles.


Summary 


Previous research has shown that control- 
display relations, position of the control and 
position of the operator are important factors 
determining the level of performance. Since 
these variables have usually been studied in 
isolation, an experiment was designed to in- 
vestigate the interaction between them. 

Twenty-four groups of male Ss practiced 
for five minutes on the Toronto Complex Co- 
ordinator. Each group worked on only one 
of the 24 combinations of experimental condi- 
tions. The results indicate that for the same 
apparatus, knowledge of the control-display 
relations alone is not sufficient to predict the
relative levels of performance.


Received December 30, 1957. 


References 


1. Adams, J. A. The evaluation of “difficulty of
task” under different conditions of performance
on the modified Mashburn Apparatus. Office
of Naval Research, Special Devices Center, Port
Washington, L. I., N. Y., 1949. (Tech. Rep.
SDC 57-2-8.)

2. Adams, J. A. The influence of the time interval
after interpolated activity on psychomotor
performance. Air Training Command, Hum.
Resour. Res. Cent., Res. Bull. 55-11. San Antonio,
Texas: Lackland AFB, 1952.

3. Andreas, B. G. Bibliography of perceptual-motor
performance under varied display-control relationships.
Rochester, N. Y.: Univer. Rochester,
1953. (Contract AF 30(602)-200, Sci. Rep.
No. 1.)

4. Andreas, B. G., Green, R. F., & Spragg, S. D. S.
Transfer effects between performance on a
following tracking task (modified SAM Two-Hand
Coordinator Test) and a compensatory
task (modified SAM Two-Hand Pursuit Test).
J. Psychol., 1954, 37, 173-183.

5. Andreas, B. G., & Weiss, [...] research on
perceptual-motor performance under
varied display-control relationships. Rochester,
N. Y.: Univer. Rochester, [...] (Contract
AF 30(602)-200, Sci. Rep. No. 2.)

6. Bilodeau, E. A. Modifications of directions of
movement preference with independent variation
of two stimulus dimensions. Air Training
Command, Hum. Resour. Res. Cent., Res.
Bull. 51-12. San Antonio, Texas: Lackland
AFB, 1951.

7. Carter, L. F., & Murray, N. L. A study of the
most effective relationships between selected
control and indicator movements. In P. M.
Fitts (Ed.), Psychological research on equipment
design, Chap. 10. Washington, D. C.:
U. S. Government Printing Office, 1947. (AAF
Aviat. Psychol. Prog. Res. Rep. No. 19.)

8. Fitts, P. M., & Simon, C. W. Some relations between
stimulus patterns and performance in a
continuous dual-pursuit task. J. exp. Psychol.,
1952, 43, 428-436.

9. Foxboro Company. A study of factors determining
accuracy of tracking by means of handwheel
control. Report to the Services No. 71.
Washington, D. C.: Division 7, National Defence
Research Committee, 1942. (Office of
Scientific Res. and Develpm. Rep. No. 3451.)

10. Gagne, R. M., Baker, Katherine E., & Wylie,
Ruth C. The effects of an interfering task on
the learning of a complex motor skill. Port
Washington, L. I., N. Y.: Office of Naval Research,
Special Devices Cent., 1949. (Tech.
Rep. SDC 316-1-9.)

11. Gardener, J. F. Direction of pointer motion in
relation to movement of flight-controls—cross-pointer
type instrument. USAF, Air Materiel
Command, Dayton, Ohio, 1950. (AF Tech.
Rep. No. 6016.)

12. Gibbs, C. B. Transfer of training and skill assumptions
in tracking tasks. Quart. J. exp.
Psychol., 1951, 3, 99-110.

13. Grether, W. F. Efficiency of several types of
control movements in the performance of a
simple compensatory pursuit task. In P. M.
Fitts (Ed.), Psychological research on equipment
design, Chap. 17. Washington, D. C.:
U. S. Government Printing Office, 1947. (AAF
Aviat. Psychol. Prog. Res. Rep. No. 19.)

14. Grether, W. F. Direction of control in relation
to indicator movement in one-dimensional
tracking. AAF Hdqrs, Air Materiel Command,
Engin. Div., 1947. (Memorandum Rep.
TSEAA-694-4G.)

15. Humphries, M., & Shephard, A. H. Performance
on several control-display arrangements as a
function of age. Canad. J. Psychol., 1955, 9,
231-238.

16. Humphries, M., & Fletcher, N. G. A classification
of tasks in motor learning. Canad. J. Psychol.,
in press.

17. Lewis, D., & McAllister, Dorothy E. An investigation
of individual susceptibility to interference.
Office of Naval Res., Special Devices
Cent., Port Washington, L. I., N. Y., 1950.
(Tech. Rep. SDC 938-1-10.)




18. Lewis, D., McAllister, Dorothy E., & Adams, J. 
A. Facilitation and interference in perform- 
ance on the modified Mashburn Apparatus: I. 
The effects of varying the amount of original 
learning. J. exp. Psychol., 1951, 41, 247-260. 

19. Lewis, D., Shephard, A. H., & Adams, J. A. 
Evidences of association interference in psycho- 
motor performance. Science, 1949, 110, 271- 
273. 

20. Lewis, D., & Shephard, A. H. Devices for study- 
ing associative interference in psychomotor 
performance: I. The modified Mashburn Ap- 
paratus. J. Psychol., 1950, 29, 35-46. 

21. Lewis, D., & Shephard, A. H. Devices for study- 
ing associative interference in psychomotor 
performance: IV. The Turret Pursuit Ap- 
paratus. J. Psychol., 1950, 29, 173-182. 

22. Lewis, D., & Shephard, A. H. Prior learning as 
a factor in shaping performance curves. Proc.
nat. Acad. Sci., 1951, 37, 124-131.

23. Lewis, D., & Smith, P. N. Retroactive facilita- 
tion and interference in performance on the 
modified Two-Hand Coordinator. Office of 
Naval Res., Special Devices Cent., Port Washington,
L. I., N. Y., 1951. (Tech. Rep. SDC
166-00-2.)

24. Loucks, R. B. An experimental study of the ef- 
fectiveness with which novices can interpret a 
localizer-glidepath approach indicator. United 
States Air Force, Air Materiel Command, Day- 
ton, Ohio. USAF Technical Report No. 5959, 
1949. 

25. McAllister, Dorothy E. Retroactive facilitation
and interference as a function of level of learning.
Amer. J. Psychol., 1952, 65, 218-232.

26. McAllister, Dorothy E., & Lewis, D. Single-trial-per-task
versus multiple-trial-per-task in
the acquisition of skill in performing several
similar tasks. Proc. Iowa Acad. Sci., 1950, 57,
407-416.

27. Mitchell, M. J. H. Direction of movement of
machine controls: III. A two handed task in a
discontinuous operation. Cambridge, England:
Psychol. Lab., Ministry of Supply, 1947. (SM
10018 (S).)

28. Mitchell, M. J. H. Direction of movement of
machine controls: IV. Right- or left-handed
performance in a continuous task. Cambridge,
England: Medical Research Council, Unit of
Applied Psychology, 1948. (MRC 48/371,
APU No. 85.)

29. Norris, Eugenia B., & Spragg, S. D. S. Performance
on a following tracking task (modified
SAM Two-Hand Coordinator Test) as a
function of the planes of operation of the controls.
J. Psychol., 1953, 35, 107-117.

30. Shephard, A. H. Losses of skill in performing
the standard Mashburn task arising from different
levels of learning on the reversed task.
Office of Naval Res., Special Devices Cent.,
Port Washington, L. I., N. Y., 1950. (Tech.
Rep. SDC 938-1-9.)

31. Shephard, A. H. The Toronto Complex Coordinator.
Canad. J. Psychol., 1955, 9, 7-14.

32. Shephard, A. H., & Lewis, D. Devices for studying
associative interference in psychomotor performance:
II. The modified Two-Hand Coordinator.
J. Psychol., 1950, 29, 53-66.

33. Speeth, W., & Lewis, D. The effects of alternating
practice on the performance of two antagonistic
motor tasks. Office of Naval Res.,
Special Devices Res. Cent., Port Washington,
L. I., N. Y., 1950. (Tech. Rep. SDC 938-1-6.)

34. Vince, M. A., & Mitchell, M. J. H. Direction of
movement of machine control (continued).
Cambridge, England: Ministry of Supply,
Psychological Lab., 1946. (SM 2861 (S).)


Journal of Applied Psychology
Vol. 42, 1958


The Postwar Occupational Adjustment of Emotionally Disturbed 
Soldiers 


John B. Miner ¹


Personnel Research Division, The Atlantic Refining Company 


and James K. Anderson 


Conservation of Human Resources Project, Columbia University


Job applicants with a history of severe emotional
disturbance have recently become a
major problem for those concerned with
personnel selection. Some employers have
adopted an explicit policy forbidding the hiring
of people known to have such a history
on the assumption that they do not represent
good employment risks. Probably the majority,
however, have established no specific
policy, preferring to leave the matter to the
discretion of individual employment interviewers.
Prior to World War II the problem
of establishing policy was, in fact, rarely
faced since interviewers found it very difficult
to obtain specific information from applicants.

Since the war it has become common practice
to examine the discharge papers of applicants
or at least inquire about the conditions
under which a man was separated from
military service. Consequently, many personnel
managers have become increasingly
aware of the very considerable incidence of
mental disease in the population. Unfortunately,
however, little information is available
as to the subsequent work histories of
men who have suffered from severe emotional
disturbances. Employers, caught between a
very definite social pressure to hire all disabled
veterans and their own misgivings as to
the value of men with a history of neurosis or
psychosis as workers, have lacked the factual
data necessary to making valid policy decisions.
The present paper represents an attempt to,
at least in part, fill this gap, utilizing data
obtained in connection with the Conservation
of Human Resources Project’s study of ineffective
performance during World War II.
More specifically an effort is made to investigate
some of the assumptions which, for lack
of specific information, have on occasion
guided personnel policy in this area. These
assumptions which form the hypotheses of
this study may be stated as follows:

1. Men with a history of neurosis or psychosis
will reveal their difficulties in occupational
adjustment by having a higher incidence
of unemployment than other men from
similar backgrounds.

2. Men with a history of neurosis or psychosis
will reveal their difficulties in occupational
adjustment by failing to obtain jobs at
as high a level as other men from similar
backgrounds.

¹ This research was carried out while the senior
author was on the staff of the Conservation of Human
Resources Project, Columbia University.

Procedure


The Ss for the study were selected from a special
sample consisting of five per cent of all enlisted men
separated from the army during and immediately
after World War II. Certain information on the
men in the five per cent sample including date of induction
and reason for separation had been punched
on I.B.M. cards and these cards were retained at the
Adjutant General's Office in Washington. From this
large sample, all men who had been inducted during
the last four months of 1942 and who had been prematurely
discharged for either neurosis or psychosis
were selected. Both the neurotics and the psychotics
were then divided into six groups, using certain
length and type of service categories. These categories
were introduced in order to facilitate analyses
beyond the scope of the present paper and have been
described in detail elsewhere (3). The 12 groups
(six neurotic and six psychotic) were reduced still
further in size using a random selection process. For
reasons extraneous to the present study, the number
of cases included in each of the 12 smaller samples
was not directly proportionate to the frequencies in
the five per cent sample.

Since, for the purposes of the ensuing analyses,
data on the two diagnostic groups (the neurotics
and the psychotics) only were desired, a weighting
process was used to combine the frequencies for the
12 length and type of service samples. The weighting
factors applied to data from the six neurotic and
six psychotic samples were derived from the actual
frequencies present in the five per cent sample.
Through their use it has been possible to obtain
composite estimates of the figures that would have
been obtained if it had been possible to study all
enlisted men inducted into the army during the last
four months of 1942 and later discharged prematurely
with either a neurotic or psychotic condition.
The diagnostic composition of these two samples has
been indicated elsewhere (3).
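
The weighting process described above can be read as a standard stratified estimate: each of the six service-category samples contributes its observed rate in proportion to that category's frequency in the five per cent sample. The sketch below illustrates the arithmetic under that reading. The function name `weighted_rate` and all of the numbers are hypothetical; the paper does not report the per-category rates or weighting factors.

```python
def weighted_rate(sample_rates, population_counts):
    """Combine per-stratum rates using population frequencies as weights."""
    if len(sample_rates) != len(population_counts):
        raise ValueError("one rate and one count are needed per stratum")
    total = sum(population_counts)
    return sum(r * n for r, n in zip(sample_rates, population_counts)) / total


# Hypothetical figures for six length-and-type-of-service categories.
# rates: proportion unemployed observed in each reduced sample.
# counts: frequency of each category in the five per cent sample.
rates = [0.10, 0.20, 0.15, 0.05, 0.30, 0.25]
counts = [400, 300, 100, 100, 50, 50]

composite = weighted_rate(rates, counts)
print(f"composite estimate: {composite:.4f}")
```

Because the reduced samples were not drawn in proportion to the category frequencies, an unweighted average of the six rates would generally differ from this composite.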

In addition to the neurotic and psychotic sam- 
ples, a control group was drawn from the original 
five per cent sample. This group consisted of en- 
listed men inducted in the last four months of 1942 
who were subsequently honorably discharged from 
the army at the end of the war as a result of de- 
mobilization. The Ss included in the control group 
were selected so as to individually match every other 
man in the neurotic sample on four variables: edu- 
cation, race, age, and county of residence. It was 
not possible to obtain perfect matching on the latter 
variable, but every effort was made to keep devia- 
tions to a minimum. Thus, if a man from a rural 
county in Georgia could not be matched with an- 
other man from the same county, he was at least 
matched with a man from another similar county in
his own or an adjacent state. The results of the 
matching on education, race, and age are presented 
in Tables 1, 2, and 3 respectively. These tables also
contain distributions for the psychotic sample, which
did not serve directly as a basis for the selection of
the control group. Nevertheless, neither the psychotics
nor the neurotics differ reliably from the controls
on any of these three variables. Chi-square
analysis using the .05 level as a criterion of adequacy
indicates that the matching process was satisfactory.
The N values noted in these tables require some
further explanation. It was not possible to obtain
adequate follow-up information on the employment
status of all Ss. Postwar information was available
on 88.7 per cent of the 115 controls, 97.4 per cent of
the 231 neurotics, and 95.8 per cent of the 120 psychotics.
These smaller numbers have been employed
in this and all ensuing analyses. Since the total neurotic
sample rather than the half which served as
a basis for matching has been utilized, the neurotics
and controls do not appear as perfectly matched.


Table 1


Percentage of Neurotics, Psychotics, and Controls
Attaining Various Educational Levels
Prior to Army Induction


                      School Years Completed
Group         N      0-6     7-8     9-11    12      13+
Neurotics    225    18.7    23.6    28.0    25.3     4.4
Psychotics   115    25.2    18.3    31.3    17.4     7.8
Controls     102    18.6    15.7    31.4    30.4     3.9


Table 2


Percentage of Neurotics, Psychotics, and Controls
of White and Negro Race


                         Race
Group         N      White    Negro
Neurotics    225     85.3     14.7
Psychotics   115     80.0     20.0
Controls     102     85.3     14.7


Table 3


Percentage of Neurotics, Psychotics, and Controls of
Various Ages at Army Induction


                            Years
Group         N     18-19   20-24   25-29   30-34    35+
Neurotics    225     7.6    55.6    17.3    10.2     9.3
Psychotics   115     9.5    58.3    17.4     8.7     6.1
Controls     102     8.8    55.9    22.5     9.8     3.0

Extensive information on all Ss was obtained from 
the files of the Army Records Center in St. Louis. 
These files contain service records, medical records, 
and separation board proceedings. In addition, con- 
siderable follow-up data have been accumulated. 
The VA files were checked through 1953. Many of 
the Ss obtained disability compensation and medical 
treatment from the VA. Others have received ex- 
tensive education and training, frequently coupled 
with vocational guidance interviews and testing. As 
a result of these contacts, the VA has built up a 
rather complete dossier on the postwar employment 
of these men. When the VA data were, for some 
reason, incomplete, a situation which was most likely 
to arise among the control Ss, they were supple- 
mented through the use of questionnaires mailed to 
the veterans’ homes. These questionnaires, which 
inquired specifically into a man’s occupational and 
educational adjustment, were mailed during the latter 
part of 1954. It was only in those few instances 
when an S’s VA file contained no information on 
postwar employment and it was impossible to con- 
tact him by mail that a man had to be eliminated 
from the sample because of inadequate information. 
For the purposes of the present analyses, only the
most recent available data have been employed. A
man was categorized as a student, unemployed, or
working as of the last date such information was obtained,
whether from the VA files or the questionnaire.
These terminal dates are indicated in Table 4.
Although the information on the controls tends to
be somewhat more recent than that on the psychiatric
groups, the controls were also discharged from
the Army later. Thus, the period since leaving military
service may be considered as of essentially the
same duration for all groups.


Table 4


Year of Last Available Follow-up Information for
Neurotics, Psychotics, and Controls


                           Group
            Neurotics    Psychotics    Controls
Year           (%)          (%)          (%)
1945-46        3.1          4.2          1.0
1947-48        4.1          7.0          5.9
1949-50       12.4          7.0          6.9
1951-52       26.2         20.9          8.8
1953-54       54.2         60.9         77.4

In order to investigate the first hypothesis, esti- 
mates of the extent of unemployment among pre- 
maturely discharged neurotics and psychotics were 
obtained using the weighting process previously de- 
scribed. The control group figure was derived by 
applying the six weighting factors developed for use 
with the neurotics to the control data as well. The 
incidence of unemployment among the neurotics and 
psychotics was compared with that for the matched
controls using chi-square techniques.
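
For a comparison of unemployment incidence between two groups, the chi-square statistic reduces to a closed form for a 2 × 2 table. The sketch below shows that computation. The counts are hypothetical and chosen only to illustrate the arithmetic, and the function `chi_square_2x2` is a name introduced here. It applies no continuity correction, which is an assumption, since the paper does not say whether one was used.

```python
def chi_square_2x2(a: float, b: float, c: float, d: float) -> float:
    """Pearson chi-square (1 df, no continuity correction) for the table

                     unemployed   not unemployed
        group 1          a              b
        group 2          c              d
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: 20 of 100 unemployed in one group, 10 of 100 in the other.
chi2 = chi_square_2x2(20, 80, 10, 90)
print(f"chi-square = {chi2:.4f}")
```

The resulting value is referred to the chi-square distribution with one degree of freedom.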

The investigation of the second hypothesis requires
a specification of the occupational level attained by
those men who were employed at the time of follow-up.
For this purpose the system of classification
according to intellectual demand described in Intelligence
in the United States (2) was used. The system
assigns each occupation to one of four levels,
depending on the intelligence required for adequate
performance. The levels may be generally described
as follows:


Level 1: High level professional workers, executives,
         those involved in the most technical
         and complex clerical and sales work,
         and large scale farmers.

Level 2: Retail managers, the more highly skilled
         workers, skilled clerical workers, foremen,
         wholesale salesmen, technicians,
         lower level professional workers, and
         relatively large scale farmers.

Level 3: Lower level skilled workers, the semiskilled,
         routine clerical workers, retail
         sales clerks, most farmers, and proprietors
         of relatively simple businesses.

Level 4: Unskilled workers.


After the occupational level of all employed members
of the neurotic, psychotic, and control samples
had been established and the distributions weighted,²
the values for the neurotics and psychotics were compared
with those for the controls. In addition, the
number of neurotics and psychotics employed at
each level was compared with similar figures obtained
as a result of categorizing the occupations found in
a representative sample of the employed population
as of 1953. Detailed descriptions of this sample,
which contained 745 Ss, have been presented elsewhere
(2, 4).

A second analysis was also carried out using the 
total neurotic, psychotic, and control samples. In 
this instance the occupational level of the last known 
job held by those men who were either students or 
unemployed at the time of follow-up was determined 
and these occupational distributions added to those 
of the employed. Comparisons between groups were 
then made in the same manner as previously indi- 
cated. 


Results 


The incidence of unemployment in the representative
sample of neurotics inducted in
late 1942 is 13.4 per cent. This figure compares
with a control value of 7.4 per cent.
The difference is not reliable (χ² = 2.58, P >
.10). Among the psychotics, on the other
hand, unemployment is quite prevalent with
48.9 per cent out of work at the time of follow-up.
When compared with the controls,
this figure yields a highly reliable difference
(χ² = 44.77, P < .01). Thus, the results obtained
with the psychotic group lend considerable
support to the first hypothesis. There
is no clear-cut evidence, however, that the
incidence of unemployment among former
neurotics is any higher than among other men
from similar backgrounds.

When the occupational distribution of the
employed neurotics is compared with that of
the controls (Table 5), a highly reliable difference
is obtained (χ² = 13.51, P < .01).
Detailed analysis indicates that this results
from an underconcentration of neurotics in
Level 2, the skilled occupations, and a corresponding
increase in Level 4, the unskilled
workers (χ²s = 6.76 and 6.71 respectively,
P < .01 in both cases). When a similar
analysis using the employed population as of
1953 is carried out, this tendency to lower
level employment among the neurotics is even
more pronounced (χ² = 22.96, P < .01). The
neurotics are reliably below the population in
Level 1 and Level 2 employment and reliably
above at Levels 3 and 4 (χ²s = 9.38, 8.12,
6.84, and 5.93, all with P < .01 except at
Level 4 where P is < .02). The difference
between the findings obtained with the
matched controls and the employed population
is attributable to the relatively greater
incidence of Level 1 employment in the population
as a whole and the decreased proportion
of Level 3 workers.


Table 5


Percentage of the Employed Neurotics, Psychotics, and
Controls and of the Employed Population
Working at Each Occupational Level


                               Occupational Level
Group                   N       1       2       3       4
Neurotics a            193     4.6    21.2    48.7    25.5
Psychotics a            56     7.6    30.6    40.5    21.3
Controls a              90      .6    35.6    51.8    12.0
Employed Population    745    12.2    31.7    38.4    17.7

a Based on weighted data.


² As with the unemployment figure, weighting of
the control distribution was carried out using the six
factors developed for use with the neurotics. The
use of the factors derived from data on the incidence
of psychosis within the five per cent sample would
not have brought about an appreciable change in the [...]
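
The overall comparison of two occupational distributions is a chi-square test on a 2 × 4 table with three degrees of freedom. The sketch below applies the usual observed-versus-expected formula to counts reconstructed from the Table 5 percentages for the employed neurotics and controls. Those counts are an approximation made for this example: the published percentages are rounded and based on weighted data, so the statistic is expected to land near, not exactly on, the reported 13.51. The helper names `chi_square` and `counts_from_percentages` are introduced here for illustration.

```python
def chi_square(table):
    """Pearson chi-square for an r x c table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under independence of group and level.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def counts_from_percentages(n, percentages):
    """Approximate frequencies from a group size and percentage distribution."""
    return [n * p / 100 for p in percentages]


# Table 5, Levels 1 to 4 (percentages are rounded, weighted values).
neurotics = counts_from_percentages(193, [4.6, 21.2, 48.7, 25.5])
controls = counts_from_percentages(90, [0.6, 35.6, 51.8, 12.0])

chi2 = chi_square([neurotics, controls])
df = (2 - 1) * (4 - 1)
print(f"chi-square = {chi2:.2f} with {df} df")
```

The same function can be applied to any pair of rows in Tables 5 and 6, with the same caveat about rounding and weighting.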

Similar results are obtained when the last
known occupation and the total neurotic sample
form the basis for the analysis (Table 6).
The neurotics and neurotic controls still differ
(χ² = 14.38, P < .01) and this difference is
still attributable to the lower Level 2 frequency
and higher Level 4 frequency among
the neurotics (χ²s = 6.13, P < .02 and 8.90,
P < .01, respectively). The neurotic-employed
population comparison is also reliable
(χ² = 34.54, P < .01), and again the underconcentration
of neurotics in Levels 1 and 2
and the overconcentration at Level 4 occurs
(χ²s = 12.78, 10.68, and 16.93, respectively,
all with P < .01). A reliable difference no
longer appears at Level 3.


Table 6


Percentage of Neurotics, Psychotics, Controls, and
Employed Population Working at Each
Occupational Level at the Time of
Last Known Employment


                               Occupational Level
Group                   N       1       2       3       4
Neurotics a            225     3.9    20.4    45.3    30.4
Psychotics a           115     3.6    17.1    29.3    50.0
Controls a             102      .7    33.1    51.3    14.9
Employed Population    745    12.2    31.7    38.4    17.7

a Based on weighted data.

In spite of the consistent evidence of lower 
level employment among the neurotics, no 
such result is obtained when the distribution 
of the employed psychotics is compared with 
that of the controls (Table 5). The differ- 
ence is not reliable (y= 5.80, P > .10). 
Similarly, the psychotic-employed population 
comparison does not produce evidence of a 
reliable difference (x? = 1.41, P > .10). 

When, however, the last known occupations 
of the students and unemployed are added to 
the employed distribution (Table 6), a sig- 
nificant chi-square value is obtained (x? = 
34.52, P < .01). The psychotics are less 
likely to have been employed in Levels 2 and 
3 occupations than the controls and much 
more likely to have been working at Level 4 
jobs (χ²s = 7.53, 11.29, and 29.99, all with
P < .01). When the psychotics are com-
pared with the employed population of 1953,
a similar result is obtained (χ² = 63.19, P <
.01). The psychotics are underrepresented at
Levels 1 and 2 and heavily concentrated at
Level 4 (χ²s = 7.56, 10.10, and 60.49, respec-
tively, all with P < .01).
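
The chi-square comparisons reported above can be illustrated with a short calculation. The following is only a sketch, not the authors' procedure: it assumes that approximate cell frequencies can be recovered by multiplying each group's N by its Table 6 percentages, which ignores the weighting noted in the table footnote. The function names are hypothetical and introduced only for this illustration.

```python
# Sketch: Pearson chi-square for a groups-by-levels contingency table.
# Assumption: cell counts are approximated as N * percentage / 100,
# which ignores the weighting applied to the original data.

def counts_from_percentages(n, percentages):
    """Convert a row of percentages into approximate cell frequencies."""
    return [n * p / 100.0 for p in percentages]


def chi_square(table):
    """Return (chi-square, degrees of freedom) for a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# Table 6 rows: psychotics and the employed population.
psychotics = counts_from_percentages(115, [3.6, 17.1, 29.3, 50.0])
employed = counts_from_percentages(745, [12.2, 31.7, 38.4, 17.7])

chi2, df = chi_square([psychotics, employed])
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
```

Under these assumptions the overall value for the psychotic-employed population comparison should fall close to the published figure, though exact agreement is not to be expected because the recovered frequencies are rounded and unweighted.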


Discussion 


Although the possibility that the lower level 
of employment among the neurotics and the 
high incidence of unemployment among the 
psychotics might be attributable to differences 
in education, race, or age has been ruled out 
through the use of matched controls, it is pos- 
sible that these findings are, at least in part, 
attributable to a restriction of the occupa- 
tional opportunities available to men who 
have received Certificate of Disability Dis- 
charges for emotional reasons. There is rea- 
son to believe, however, that this is not a 
major factor contributing to the results. Very 
rarely do statements appear in either the VA 
records or the questionnaires of men separated 
for neurosis or psychosis indicating that they 




have had difficulty obtaining employment
because of the nature of their discharges.
Furthermore, the VA records and questionnaires
contain considerable evidence that the VA
frequently gave special employment assistance to
these veterans. As a result, a number appear
to have been hired for positions in industry
as well as in government which they, in all
probability, would have been unable to obtain
if they had not been discharged with a
service-connected disability. Certainly, the
very marked differences in the postwar
adjustment patterns of the neurotics and
psychotics, both of whom received the same type
of discharge, are not readily explainable in
terms of restricted occupational opportunity
due to conditions of separation from military
service.

Although the high incidence of unemployment
among the psychotics results in part
from the fact that a number of these men
were undergoing long-term hospitalization at
the time of follow-up, this is by no means the
only determinant. Slightly over 65 per cent
of the psychotic unemployed were not in
hospitals. Among the neurotics 83 per cent of
the unemployed were not institutionalized at
follow-up. None of the controls were in
hospitals. It is also clear from the analysis of
occupational level at the time of last known
employment that the majority of the
unemployed psychotics last worked at Level 4,
unskilled jobs. This appears to be true of 75
per cent of the psychotics who were not
working at follow-up. The figure for the neurotics
is 61 per cent, but of course much fewer were
out of work. Because of the low number of
unemployed controls and the difficulties
involved in applying weighting procedures to
such a small sample, a reliable figure cannot
be given for this group, but it is almost
certainly not over 50 per cent, and probably
considerably below that.

This suggests that there are, among those
with a history of psychosis, a large number of
men who drift in and out of unskilled jobs,
and spend most of their time in an unemployed
status. If this is true, one would expect
to find a group of individuals who “just
happened to be employed at the time of
follow-up” included in the Level 4 frequency. A


check on this assumption reveals that almost 
40 per cent of the psychotics who were em- 
ployed in unskilled jobs at follow-up had a 
history of long periods of unemployment and 
frequent job changes. Among this latter 
group as well as among the unemployed psy- 
chotics who had last worked in Level 4 po- 
sitions, the usual pattern was not a gradual 
or even a precipitous decline in employment 
level. Characteristically, these men had never 
worked in any job above the semiskilled level; 
perhaps the majority, never above the un- 
skilled. 

Within the limits of generalization possible 
from the present samples, several guidelines 
for personnel policy emerge. Although these 
conclusions cannot be assumed to have valid- 
ity in all employment situations, they should 
in most instances contribute to the efficiency 
of a company’s personnel procedures. Ideally, 
however, local studies, which take into ac- 
count variations in the labor market as well 
as company hiring and promotion procedures, 
should be carried out before policy is estab- 
lished. 

Men with a history of psychosis apparently 
divide into two groups of approximately equal 
size. The first, as far as is presently known, 
make quite satisfactory employees. Further 
study may indicate that in spite of the fact 
that they adjust at approximately the same 
occupational level as other men from similar 
backgrounds, they do so at the expense of 
somewhat poorer performance. There is at 
present, however, no evidence to support this 
conclusion. The second group, on the other 
hand, appears to represent a poor employ- 
ment risk. These men, whose histories may 
be expected to contain evidence of consider- 
able unemployment, are not likely to stay on 
the job very long—either they will have to be 
released or they will leave on their own voli- 
tion. They are very likely to have held only 
unskilled positions previously. Any man with 
a history of psychosis, who has worked at un- 
skilled jobs in the past and who has been un- 
employed over considerable periods of time, 
should, under economic conditions similar to 
those of the postwar period, be hired only 
with the full realization that there is a high 




probability of his soon becoming unemployed 
again. 

By way of contrast, the neurotics as a group
would appear to represent a much better em-
ployment risk. They are not likely to spend
long periods among the ranks of the unem- 
ployed. However, men with a history of neu- 
rosis cannot be expected to attain supervisory 
and skilled positions as frequently as other 
employees. An employer who hires such a 
man with the expectation of moving him into 
a higher level position is likely to be disap- 
pointed. This does not imply that the for- 
mer neurotic is necessarily a poor worker. 
The limited evidence we have suggests that as 
a group neurotics are likely to perform just 
as well as other employees, at least in lower 
level jobs (1). It seems probable, however, 
that they do typically lack whatever qualities 
make for placement or continued employment 
in supervisory and skilled positions. 


Summary 


Employment status approximately 10 years 
after discharge was determined for representa- 
tive samples of men inducted into the army 
during late 1942 and prematurely separated 
for neurosis and psychosis. Similar informa- 
tion was obtained on a control group matched 




in terms of education, age, race, and county 
of residence. It was found that nearly half 
of the men with a history of psychosis were 
unemployed at follow-up as compared with 
approximately seven per cent of the controls. 
The neurotics did not have a reliably higher 
incidence of unemployment, but did tend to 
concentrate in low level occupations, particu- 
larly those of an unskilled nature. The pro- 
portion of men working in skilled and super- 
visory positions among the neurotics was con- 
siderably below that of the controls. This 
downward shift in occupational level was not
present among those psychotics who were em-
ployed.


Received January 2, 1958. 


References 


1. Markowe, M. Occupational psychiatry: An historical
survey and some recent researches. J. ment. Sci.,
1953, 99, 92-101.

2. Miner, J. B. Intelligence in the United States.
New York: Springer, 1957.

3. Miner, J. B., & Anderson, J. K. Intelligence and
emotional disturbance: Evidence from Army
and Veterans Administration records. J. abnorm.
soc. Psychol., 1958, 56, 75-81.

4. Tomkins, S. S., & Miner, J. B. The Tomkins-Horn
Picture Arrangement Test. New York:
Springer, 1957.




Journal of Applied Psychology
Vol. 42, No. 5, 1958


Adequacy of the Residual Sensory Cues for Psychomotor Performance
of Arm Amputees¹,²


Hilde Groth and John Lyman 


University of California, Los Angeles 


Amputation of the upper extremity deprives
the individual not only of much of his normal
motor function but also of a large amount of
the sensory information which regulated this
motor function. Although artificial arms
constitute a fairly efficient mechanical
replacement for some of the incurred motor loss,
experimental results indicate that performance
of visuo-motor tasks with a prosthesis shows
a considerable decrease of performance speed
(3). For the purpose of future design
modifications it is of practical interest to ascertain
whether this performance decrement should be
attributed primarily to the loss of sensory
information and, hence, to inadequate sensory
regulation of the prosthesis movements or to
a relatively inadequate motor replacement by
the prosthetic device. Both of these
possibilities focus on a third alternative, an
increase in “central integration time.” For the
amputee any psychomotor task is necessarily
much more complex than for the nonamputee
since the prosthesis is interposed between the
muscular motion and the control operation
as an additional technical link. Aside from
mechanical delays in the device itself,
coordination of this extra link into the operating
cycle conceivably can prolong the reaction
time.

The specific purpose of this study was to
investigate the extent of the usefulness of
remaining sensory cues for prosthesis performance
on two visuo-motor tasks. The sensory
cues involved consist of a composite of afferent
information received from the pressure of the


¹ This work was supported by Contract No. VAm-[…]
between the Veterans Administration and the
Department of Engineering, University of California,
Los Angeles. The opinions expressed are those of the
authors and do not necessarily represent those of the
Veterans Administration.

² The authors wish to thank C. L. Taylor, Project
Leader, for his cooperation in permitting this research
to be undertaken, and J. R. Zweizig for the
development of the electronically controlled apparatus.


socket on the stump, from the pressure exerted 
by harness straps, from movements of the 
stump and shoulder girdle during prosthesis 
operation and, in certain cases, from move- 
ments of and pressures within the biceps 
cineplastic tunnel as well as an undefined 
amount of visual information.³ For labora-
tory investigation of the role of these cues a
manipulation and tracking task was chosen to 
provide information about the following varia- 
bles: 


Task 1: Manipulation Task
a. Adequacy of object weight perception
with pressure cues.
b. Adequacy of response speed on a task
requiring gross motions with the prosthesis.
c. Problem of “central integration time” in
a complex task.


Task 2: Tracking Task
a. Adequacy of pressure cues for dynamic
force adjustments.
b. Adequacy of response speed when no
gross motions of stump and prosthesis
are required.
c. Problem of “central integration time” in
a complex task.


Accuracy of weight perception during ma- 
nipulation was considered to be a fair crite- 
rion for the adequacy of the available sensory 
cues. An excessively high level of grasp force 
during manipulation in comparison to the 
level of force exerted by nonamputees on the 
same task would be indicative of too little 
“feel” in the prosthesis. Accuracy of dynamic 
pressure discrimination, i.e., performance of a
task requiring continuous adjustments of grasp 
force without change in terminal device open- 
ing, by exclusion of gross motions, was ex- 


³ A cineplastic tunnel is a skin-lined tunnel con-
structed surgically in the biceps or other muscle and
used as a “muscle motor” to power the control cable 
for prehension. A description has been published 
elsewhere in this journal (3). 




pected to provide an estimate of the pressure 
sensitivity of the cineplastic tunnel or alterna- 
tively of the shoulder girdle areas underneath 
the harness straps. 


Procedure 
Subjects 


Group I: 10 right-handed engineering students
performing with the right hand.

Group II: 10 right-handed engineering students
performing with the left hand.

Group III: 8 unilateral below elbow amputees
wearing harness-controlled prostheses, performing
with the hook.

Group IV: 4 unilateral below elbow amputees
wearing cineplasty-controlled prostheses, performing
with the hook.

Group V: 3 unilateral above elbow amputees wearing
harness-controlled prostheses, performing with
the hook.

All amputees were regular prosthetic hook wearers
and had served repeatedly as research amputees in
the laboratory. The students were recruited from
classrooms.

Performance of unilateral amputees with the artificial
limb was considered as performance with the
nonpreferred limb regardless of which side had been
amputated.


Apparatus and Tests 


Task 
the simple motion elements 
release. Task complexity 


rangement was placed on a table 30” high. 


Details of the instrument 
in detail elsewhere (2). A 
“random” target problem was generated on a dual
beam oscilloscope in the form of a vertically moving
line across the face. The line was tracked with a
light dot controlled by prehension force variations.
The same split cylinder force transducer as was ma-
nipulated in Task 1 was placed on a support and
used as a controller. The target line had a maximal
vertical motion range of ±3.0 cm. from the center of




the scope face. The tracking dot had its zero posi- 
tion 3.0 cm. below the center and had a maximal 
vertical range of +6.0 cm. The force required for 
tracking with this dot was approximately 165 gm./ 
cm. The apparatus was placed on a table 30” high; 
the force transducer rested on a 10” stand in front of 
the scope. Number of target hits, time on target 
and time on target per hit constituted the criterion
measures.
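
The relation between prehension force and dot position described above can be sketched as follows. This is an illustration only: it assumes the stated figure of approximately 165 gm./cm. holds linearly over the whole 6.0 cm. range of the dot, which the text does not say explicitly, and the function name is hypothetical.

```python
# Sketch: prehension force needed to hold the tracking dot at a given
# height on the scope face, measured in cm. from the center.
# Assumption: force is linear in dot displacement over the full range.

GM_PER_CM = 165.0    # approximate force per cm. of dot travel
DOT_ZERO_CM = -3.0   # dot rest position: 3.0 cm. below center
DOT_RANGE_CM = 6.0   # maximal vertical range of the dot


def force_for_dot_position(y_cm):
    """Return the force (gm.) that places the dot at y_cm from center."""
    displacement = y_cm - DOT_ZERO_CM
    if not 0.0 <= displacement <= DOT_RANGE_CM:
        raise ValueError("position lies outside the range of the dot")
    return GM_PER_CM * displacement


for y in (-3.0, 0.0, 3.0):
    print(f"dot at {y:+.1f} cm. -> about {force_for_dot_position(y):.0f} gm.")
```

On this reading, holding the dot at the center of the scope face would call for roughly 495 gm. and the top of its range for roughly 990 gm.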

Terminal devices for amputees. A left or right 
APRL voluntary closing hook, modified by removal 
of the locking mechanism, was worn by each am- 
putee. These hooks have been described previously 
(3, 4). Voluntary closing hooks were chosen be- 
cause the amputee had to apply control cable tension 
in direct proportion to the prehended load. The
ratio between control cable force and output force
at the hook tip was measured as 2.5:1. A prehension
force of 1,000 gm. thus required a cable pull of about
2,500 gm.
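
The arithmetic of the cable pull can be made explicit. The sketch below takes the worked figure in the text (1,000 gm. at the hook tip requiring about 2,500 gm. of cable pull) as defining a factor of 2.5, and applies it to the amputee mean prehension forces of Table 1. The function name is hypothetical, and a constant ratio over the whole force range is an assumption.

```python
# Sketch: control cable pull implied by a given prehension force.
# Assumption: the 2.5 factor is constant over the force range used.

CABLE_TO_TIP_RATIO = 2.5


def cable_pull_gm(prehension_force_gm, ratio=CABLE_TO_TIP_RATIO):
    """Return the cable pull (gm.) needed for a force at the hook tip."""
    return prehension_force_gm * ratio


# Amputee mean prehension forces from Table 1 (gm.).
amputee_means = {
    "self-paced, regular pattern": 1478,
    "self-paced, random pattern": 1359,
    "experimenter-paced, regular pattern": 1328,
    "self-paced, speed trial": 1332,
}

for treatment, force in amputee_means.items():
    print(f"{treatment}: about {cable_pull_gm(force):.0f} gm. of cable pull")
```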

The hook tips were covered with commercial rubber 
sleeves in order to obtain similar conditions of surface 
friction for bare hand and hook performance. The 
coefficient of static friction between skin and alumi- 
num was approximately 0.7 and for the rubber 
sleeves it was 0.5. 
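
The friction coefficients can be related to the just-slip forces given in the note to Table 1 (about 604 gm. for bare hands and about 844 gm. for hooks). The sketch below is an assumption-laden illustration, not a computation taken from the paper: it supposes a vertically held cylinder gripped on two opposing surfaces, so that the weight is balanced by friction at both contacts. The function names are hypothetical.

```python
# Sketch: grip force at which a held object just begins to slip.
# Assumption: two opposing contact surfaces share the load equally,
# so weight = 2 * mu * grip_force at the point of slipping.


def just_slip_force_gm(weight_gm, mu):
    """Smallest grip force (gm.) that holds an object of given weight."""
    return weight_gm / (2.0 * mu)


def implied_weight_gm(slip_force_gm, mu):
    """Object weight (gm.) implied by an observed just-slip force."""
    return 2.0 * mu * slip_force_gm


print(f"bare hand: {implied_weight_gm(604, 0.7):.1f} gm.")
print(f"hook:      {implied_weight_gm(844, 0.5):.1f} gm.")
```

Under this assumption the two just-slip forces imply nearly the same object weight, which suggests that the reported coefficients and slip forces are mutually consistent.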

Experimental noise. Since there was a possibility
that amputees would respond to some auditory cues
from the prosthesis, head phones were worn by all
Ss. A random noise generator provided the masking
noise. The head phone output was adjusted to
90 db SPL.


Routine 


The following six treatments were administered in 
a subject by treatment design during a single ex- 
perimental session: 


1. Pursuit tracking task; duration: 5 min.

2. Manipulation task, self-paced, regular pattern;
duration: 10 min.

3. Manipulation task, self-paced, random pattern;
duration: 10 min.

4. Manipulation task, experimenter-paced, regular
pattern; duration: 10 min.

5. Manipulation task, experimenter-paced, random
pattern; duration: 10 min.

6. Manipulation task, speed trial.


Treatments 2, 3, 4 and 5 were given in random order. 

Tracking task. The cylinder was resting on a sup-
port in front of the oscilloscope and S pursued the
moving target line with the light dot for five minutes.


eight-hour job. (b) Experimenter paced (EP); the 

stimulus lights changed at 2-sec. intervals, 

(a) Regular pattern. The 

Ss started at the lower left side of the control board 

and placed the cylinder into the adjacent hole, work- 

d and away from the body. 
the signal for each move 




Table 1

Mean Prehension Forces and Variabilities on Manipulation Task

                                          Nonamputees           Amputees
                                         PF        PF         PF        PF
Treatments                            X (gm.)   s (gm.)    X (gm.)   s (gm.)

Self-paced, regular pattern            1,022      561       1,478      769
Self-paced, random pattern             1,174      617       1,359      558
Experimenter-paced, regular pattern    1,213      254       1,328      522
Self-paced, speed trial                1,305      898       1,332      519

Note: Just slip force: 1. bare hands ~604 gm.; 2. hooks ~844 gm.


was given by the flashing of a small light bulb
fastened in the center of the upper border of the
display panel. (b) Random pattern. The Ss placed
the cylinder into that hole of the control panel which
corresponded in location to the lighted circle of the
matrix. In the SP condition, the display
light would change to the next position as soon as
the contact of the preceding move was made. For
the EP condition, the location of the stimulus light
would change at two seconds’ intervals.

Speed trial. The Ss were asked to perform five
series of the regular pattern as fast as possible. No
phones were worn for this condition.

Each task was explained in detail and demonstrated
by E before the experimental run.


Nonamputee Ss grasped the cylinder around the 
upper rim with the pads of all five fingers leaving 
the slot between the thumb and index finger. Ampu- 
tee Ss grasped it with the length of the rubber cov- 
ered hook tips with the slot pointing away. By this 
method we were able to approximate the sizes of the 
contact areas of bare hand and hook performance to 
satisfy the equation, F = P × A, where F = force, P
= pressure, A = area of contact.

All treatments were administered while the Ss were 
standing. 

Calculations. All comparisons for statistical infer- 
ence were made by nonparametric methods since 
data of amputee performance showed neither nor- 
mality of trait distribution nor homogeneity of vari- 
ance (8). The significance level was set at P<.01. 
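
For readers unfamiliar with the nonparametric comparison used in the Results, the Mann-Whitney U statistic can be sketched as below. This is a minimal illustration with tied values given average ranks; it computes only the U values and not the significance level, and the sample times are invented for the example rather than taken from the study.

```python
# Sketch: Mann-Whitney U statistics for two independent samples.
# The data below are hypothetical and serve only as an illustration.


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y); the smaller is the usual test statistic."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2.0
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical times per transport (sec.) for two small groups.
nonamputee_times = [0.48, 0.52, 0.55, 0.61, 0.63]
amputee_times = [0.82, 0.95, 1.01, 1.10, 1.24]

u_non, u_amp = mann_whitney_u(nonamputee_times, amputee_times)
print(f"U (nonamputees) = {u_non}, U (amputees) = {u_amp}")
```

The smaller of the two U values would then be referred to a table of critical values for the two sample sizes at the chosen significance level.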


Table 2

Mean Times per Transport and Variabilities in Manipulation Task

                                          Nonamputees           Amputees
                                        T/tr       T/tr      T/tr       T/tr
Treatments                            X (sec.)   s (sec.)  X (sec.)   s (sec.)

Self-paced, regular pattern              .55       .14       1.01       .32
Self-paced, random pattern               .70       .21       1.20       .36
Experimenter-paced, regular pattern      .60       .12        .90       .16
Self-paced, speed trial                  .40       .09        .78       .23




Table 3

Mean Tracking Performance and Variability of Amputees and Nonamputees

                                            Amputees         Nonamputees
Criterion Measure                           X        s         X        s

Total number of hits                     241.14    68.92    302.75    39.49
Number of hits, first minute              49.71    16.00     62.30    10.91
Number of hits, fifth minute              48.36    14.63     62.15     8.38
Time on target, first minute              21.93     8.07     31.50     3.24
Time on target, fifth minute              22.29    11.13     28.20     5.51
Time on target per hit, first minute       0.48     0.26      0.53     0.13
Time on target per hit, fifth minute       0.57     0.55      0.47     0.14


Results⁴


Manipulation Task: Comparison of Prehension
Forces (PF)


Comparisons of amputee with nonamputee 
performance on this task were made between 
all amputees (Groups III, IV, V) and the 10 
nonamputee Ss (Group II) performing with 
the nonpreferred hand. 

Inspection of the mean PF values for four 
treatment conditions for these groups showed 
consistently higher force levels for the ampu- 
tees. However, statistical comparisons of the 
median differences by the Mann-Whitney U 
test failed to reach significance. 

No data could be reported for the experimenter-paced
random pattern condition since
none of the amputees were able to complete
this task. Complexity of the task led to confusion
and performance breakdown in each
case.

Mean values and variabilities are shown in
Table 1.


Manipulation Task: Comparison of Times 
Per Transport (T/tr) 


Statistical comparison between amputee and 
nonamputee performance by the Mann-Whit- 
ney U test showed that amputees performed 
consistently slower. This difference was sig- 
nificant for each of the four conditions. The 


⁴ The statistical tables have been deposited with the
American Documentation Institute. Order Document
No. 5690 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress,
Washington 25, D. C., remitting in advance $1.25 for
microfilm or $1.25 for photocopies. Make checks
payable to Chief, Photoduplication Service, Library
of Congress.


mean values and variabilities are reported in 
Table 2. 


Manipulation Task: Performance of Amputee 
Subgroups 


The small size of the amputee subsamples
did not permit any statistical comparisons of
performance. However, inspection of the raw
data indicated a lack of performance differences
due to amputation level or method of
muscle-prosthesis coupling.


Tracking Task: Comparison of Target Hits 

Tolerance limits for “on target” were set as
a 6 mm. wide band on the scope face. Visually
this meant for the S that he was on target
whenever the perimeter of the tracking
dot touched the target line.

Comparisons of amputee with nonamputee
performance have been made between all amputees
(Groups III, IV, V) and all nonamputees
(Groups I, II). Combining Groups I
and II seemed justified since none of the Ss
had any previous experience with a similar
task.

Statistical comparisons were made for three
time intervals: (a) total number of hits during
the five-minute trial; (b) number of hits
during the first minute; (c) number of hits
during the fifth minute. None of these comparisons
reached statistical significance.


Tracking Task: Comparison of Time on
Target


Comparisons have been made for the first
and fifth minutes of performance. None of
the differences reached statistical significance.




Tracking Task: Comparison of Time on 
Target Per Hit 


Comparisons for first and fifth minute of 
performance again failed to differentiate be- 
tween amputees and nonamputees. 

Table 3 summarizes the mean values and
variabilities for the tracking task.


Tracking Task: Performance of Amputee
Subgroups


No statistical comparisons were possible because
of the small size of the subsamples. Inspection
of the raw data again failed to indicate
pronounced performance differences attributable
to amputation level or method of
coupling. However, one of the Ss with cineplastic
prosthesis control could not complete
the task because of a tremor in the biceps.


Discussion 


To the authors’ knowledge, no systematic
investigation of the role of cutaneous and
kinesthetic information during active motor
performance with artificial arms has been
conducted previously. The present study also
falls far short of the goal of assessing the
importance of the various available cues in
psychomotor skills. For example, it would have
been desirable to study performance changes
during selective anesthetization of tactile and
proprioceptive afferents without also affecting
the motor efferents. The authors made some
preliminary attempts to do this and failed,
for present methods of anesthesia do not
seem to permit adequate separation of
sensory components.

The problem, therefore, has been approached
on a less refined level of analysis and the
results provide at best an initial evaluation of
the practical utility of the composite of all
residual cues.

The principal findings of this experiment
can be summarized as follows:

1. Amputees, regardless of amputation level
or type of prosthesis control, made adequate
use of the available cues. This was evidenced
by prehension force adjustments when
manipulating an object of unknown weight as
well as by the tracking scores which also were
dependent on the accuracy of pressure
discriminations.


2. Amputees showed a pronounced impair- 
ment in response speed when the task in- 
volved gross motions as was shown on the 
manipulation task. Response speed on the 
tracking task, which did not involve gross 
movements with the prosthesis, seemed to be 
comparable to normal hand performance. 

3. Task complexity requiring additional 
“central integration time” as accomplished by 
imposition of a random pattern upon the ex- 
perimenter paced manipulation task led to 
breakdown of performance for all amputees. 

From these results we inferred that ampu- 
tees apparently have fairly adequate sensory 
control for execution of simple visuo-motor 
tasks. However, visuo-motor tasks requiring 
gross motions show impairment of amputee 
performance due to an increase in time needed 
for completion of a given movement with the 
prosthesis. This was brought out when speed 
stress was increased by changing the task com- 
plexity. 

The present investigation does not permit 
us to estimate the relative importance of vi- 
sion to the reliable interpretation of the cu- 
taneous and proprioceptive cues perceived dur- 
ing performance. It seems to be fairly well 
established that vision alone in the absence 
of other sensory information cannot serve as 
an adequate source of information for motor 
performance (6, 7). This would be expected 
to be particularly the case where the task 
requires accurate prehension force control as 
was true in our tracking task. On the other 
hand it has been found that blind-folded per- 
formance will cause a decrement in manipula- 
tion as well as in travel time, and even after 
prolonged practice these components of per- 
formance will not reach the level of proficiency 
attainable with unrestricted vision (5). An 
experiment in this laboratory which compared 
a prosthesis of the type used here with a 
prosthesis in which prehension force was elec- 
trically actuated by an on-off switch showed 
an increase in reaction time only for the elec- 
tric arm when visual cues were removed (1). 
Because of the type of actuation fewer non- 
visual cues were available for the electric arm. 

The present results as well as some of our 
earlier ones (1, 4) seem to indicate clearly 
that amputees are capable of using nonvisual 




as well as visual cues effectively with present 
prosthesis designs. Accordingly, we feel that 
no specific sensory aids should be recom- 
mended for prosthesis design until both the 
adequacy and reliability of nonvisual cues 
have been studied for tasks in which the 
obstruction of vision has been investigated 
further. 


Summary 


This investigation was designed to evaluate 
three possible sources of performance decre- 
ments of amputees: (a) Inadequate sensory 
information during prosthesis performance. 
(b) Increase in duration of the mechanical 
motions in comparison to normal movements. 
(c) Increase of the “central integration time” 
during a task due to the complexity added by 
the prosthesis. Identification of the principal 
source of decrement would serve as an indicant
for the site of modification of the engineering
design of the prosthesis. This modification
could take the form of sensory aiding
devices, improvement of the mechanical func- 
tion, or simplification of the coupling to the 
control musculature. 

Two visual motor tasks were chosen to pro- 
vide information about these possible sources 
of performance deterioration: (a) Manipula- 
tion task; (b) pursuit tracking task. Twenty 
nonamputees and 15 unilateral amputees 
served as experimental Ss. The amputees 
consisted of eight below-elbow amputees with 
harness control, four below-elbow amputees 
with cineplastic control, and three above-
elbow amputees with harness control.

The principal findings were: (a) Perform- 
ance showed that all amputees received ade- 
quate pressure cues; (b) all amputees showed 




a pronounced impairment in response speed 
when the task required gross movements with 
the prosthesis; (c) task complexity inducing 
time stress by requiring additional “central 
integration time” led to performance break- 
down and confusion. 

Since the study did not permit an estima- 
tion of the adequacy and reliability of cu- 
taneous and proprioceptive cues in the ab- 
sence of vision, another investigation in which 
vision is obstructed has been suggested. 


Received January 2, 1958.


References 


1. Gottlieb, M., Santschi, W., & Lyman, J. Evaluation
of the Model IV-E2 electric arm. Department
of Engineering, Univer. California,
Los Angeles, 1953. (Rep. 53-23.)

2. Groth, Hilde. An experimental assessment of
prehension force as a measure of effort in
psychomotor skills. Unpublished doctoral dissertation,
Univer. California, Los Angeles, 1957.

3. Groth, Hilde, & Lyman, J. Relation of the mode
of prosthesis control to psychomotor performance
of arm amputees. J. appl. Psychol.,
1957, 41, 73-78.

4. Groth, Hilde, & Lyman, J. A comparison of two
modes of prosthetic prehension force control
by arm amputees. J. appl. Psychol., 1957, 41,
325-328.

5. Huiskamp, J., Smader, R. C., & Smith, K. U.
Dimensional analysis of motion: IX. Comparison
of visual and nonvisual control of component
movements. J. appl. Psychol., 1956,
40, 181-186.

6. Lassek, A. M. Effect of combined afferent lesions
on motor function. Neurology, 1955, 5, 269-272.

7. Twitchell, Th. E. Sensory factors in purposive
movement. J. Neurophysiol., 1954, 17, 239-252.

8. Walker, H. M., & Lev, J. Statistical inference.
New York: Holt, 1953.

Journal of Applied Psychology


Curriculum Assessment with Critical Incidents 


Albert S. Glickman 


U. S. Naval Personnel Research Field Activity, Washington ¹


and T. R. Vallance 


Human Resources Research Office 


By and large, even in industry, adequacy
of training programs has been assessed by
subjective judgment rather than by research
methods (5). As Flanagan (1, 2) has remarked,
there is need to supplement the usual
collection in conference of “leaders” or “experts”
by systematic collection and analysis
of factual data.

This investigation explores the use of “critical
incidents” (1, 2, 3) for assessing the relevance
of various aspects of a training program
to the job performance requirements of
its graduates. Specifically, it involves the
relevance of subject matter taught in the
Navy’s Officer Candidate School (OCS) to the
duty requirements of new officers aboard
destroyer-type ships.

The 17-week OCS training aims to prepare
the new ensign to profit from experience and
to achieve a satisfactory level of job performance.
It is not intended to produce specialists.
Rather, the purpose is to provide
junior officers with the familiarity in subject
matter areas containing technical problems
needed to develop an understanding of a variety
of technical functions, and to develop
ability to supervise technicians.

The critical incident procedure, as adapted
here, takes cognizance of the supervisory nature
of the junior officer's duties and at the
same time permits the use of duty performance
behaviors of sufficient specificity to indicate
OCS curriculum elements most and
least relevant to significant shipboard duties.
Also, by concentrating upon observed actual

¹ This research was conducted while the authors
were with the Officer Personnel Research Program
of the American Institute for Research, working under
Contract Nonr 890(01) with the Office of Naval
Research. A more extensive report was prepared at
that time (4). The viewpoints expressed herein are
not to be construed as those of the U. S. Navy or
Army.


performances on duty, the procedure makes it 
possible to identify duty areas which are not 
covered by instruction, thus contributing to a 
further definition of what the objectives and 
content of training should be. 

The essential questions to which this re- 
search is addressed are these: “What are the 
things which happen frequently to ensigns, or 
which their superiors think they ought to be 
able to handle soon after reporting aboard?” 
and then, “Are these the things to which at- 
tention and emphasis are also given by OCS 
instruction?” 

Our legitimate concern is exclusively with 
the performance requirements (and relevant 
training) reflected in the content of incidents, 
whether they be found in reports of “effective” 
or “ineffective” incidents. This research does 
not tell us how much OCS training contributes 
to causing effective or ineffective incidents. 


Determining the Gross Relevance of OCS 
Curriculum to Duty Incidents 


Basic Data 


An earlier study (6) had accumulated in- 
cidents involving destroyer officers from all 
training sources. Of those, 1,073 incidents 
reported on general line ensigns constituted 
our pool of basic data. This sample of in- 
cidents is used here to provide a description 
and operational definition of the critically sig- 
nificant elements of the duties of new officers 
aboard destroyers. Incidents were typed on 
IBM cards to facilitate handling and analysis. 


Analysis of the Incident Pool with Reference 
to Curriculum Areas 


The incident cards were submitted to sub- 
ject matter specialists, OCS officer-instructors, 
who carried out a review sequence that ul-
timately required that each incident be identi-
fied with a lesson in the lesson-plan of one of
the seven sections of the curriculum: 1. Sea- 
manship, 2. Orientation and Military Justice, 
3. Navigation, 4. Operations, 5. Naval Weapons 
and Fire Control, 6. Engineering and Damage 
Control, and 7. Military. Category 8, labelled 
“Other,” was established for those incidents 
which reflected motivation and interest in the 
job rather than the possession of specific skills 
or knowledges. 

The instructional set for sorting effective 
incidents concerned the curricular locus of 
skills, knowledges and attitudes shown in each 
incident. For the ineffective incidents the 
emphasis was on the skills, knowledges and 
attitudes which, if learned, could have pre- 
vented the ensign from getting involved in the 
incident. 

Within each subject matter category, in- 
cidents were classified in two ways: “Taught” 
—the relevant skills, knowledges and attitudes 
are currently the subject of specific lessons of 
instruction; and “Not Taught”—pertains to 
material not covered, but which logically be- 
longs in one of the school sections and prob- 
ably would be taught were time and facilities 
available. All “Other” incidents were, of 
course, of the “not taught” variety. 

No assumption can be made that individual 
incidents, or even incidents of a given sort, 
are equally significant from the operational 
standpoint. Yet, if a large number of the 
critical incidents reported fall into a given sub- 
ject matter area, it is difficult to escape the 
conclusion that there is much important ac- 
tivity aboard ship to which that area of in- 
struction is relevant. Likewise, if the number 
of incidents in certain instructional areas is 
small, unless it is found that the incidents are 
individually of vital significance, it would 
seem that the learnings in such areas are not 
heavily drawn upon in the performance of a 
new officer’s duties. 

It appears from Table 1 that it is from his 
background in the courses Orientation and 
Military Justice, Seamanship, and Operations, 
that on reporting to a destroyer the new 
ensign might expect to draw most frequently, 
while being less likely to be called upon to 
make use of what he learned in Navigation,
Engineering and Damage Control, and Naval
Weapons and Fire Control.

Can the fact that 62.9% of the classifiable 
incidents are said by instructor-judges to be 
taught at OCS be considered to reflect a 
“good” state of affairs? Since the limits im- 
posed by time and facilities have not been 
considered and since there is no standard of 
what “percentage taught” can be considered 
good, this question cannot be answered in an 
absolute sense. However, we might say that 
the higher the “percentage taught” the better. 

Thus, it might be presumed that of the six 
academic courses, Seamanship and Orientation 
and Military Justice most completely cover 
their respective areas, and that the Naval 
Weapons course least adequately treats of the 
kinds of situations that destroyer officers re- 
port as critical. However, there are instances 
where 50 or more incidents are considered 
relevant to one hour of instruction, leading one 
to discount the direct interpretation of “per-
centage taught” as a sufficient criterion of the
adequacy of subject matter coverage. 
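
The “percentage taught” figures discussed above are simple ratios of “taught” incidents to all classifiable incidents in an area. As a minimal sketch, assuming the Summary-column counts of Table 1 are read correctly, the following Python recomputes those percentages. The variable and function names are illustrative and are not part of the original study.

```python
# Summary-column counts from Table 1: (taught, not taught) per curriculum area.
SUMMARY_COUNTS = {
    "Seamanship": (158, 52),
    "Orientation and Military Justice": (365, 134),
    "Navigation": (9, 1),
    "Operations": (85, 79),
    "Weapons": (15, 39),
    "Engineering": (30, 22),
    "Military": (2, 2),
    "Other": (0, 63),
}


def percent_taught(taught, not_taught):
    """Share of an area's classifiable incidents judged 'taught', in percent."""
    total = taught + not_taught
    return round(100.0 * taught / total, 1) if total else 0.0


by_area = {area: percent_taught(t, n) for area, (t, n) in SUMMARY_COUNTS.items()}
total_taught = sum(t for t, _ in SUMMARY_COUNTS.values())
total_not_taught = sum(n for _, n in SUMMARY_COUNTS.values())
overall = percent_taught(total_taught, total_not_taught)

for area, pct in by_area.items():
    print(f"{area:35s}{pct:6.1f}")
print(f"{'All classifiable incidents':35s}{overall:6.1f}")
```

The overall figure is taken over the 1,056 classifiable incidents rather than the original 1,073, which is why it differs from the column percentages based on N/1073.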

Reference to the number of incidents as-
signed to specific hours of instruction, along
with further examination of the contents of
sessions by subject matter experts, would seem
to provide a more meaningful employment of
data such as these for purposes of curriculum
analysis.


Developing an Index of Incident-Behavior 
“Importance for Early Usefulness” 
Rationale 


The main objective of the next phase was 
to obtain expert judgments regarding the sig- 


2 Some incidents were considered relevant to more
than one lesson. However, in Table 1, no incident
was counted more than once, since when an incident
was identified with more than one lesson, these les-
sons were always in the same curriculum section.
The judges could not classify 17 of the original 1073
incidents.

3 If the “percentage taught” figures for effective
and ineffective incidents for the six academic courses
(all except Military) are arranged in rank order, the
correlation between ranks for “effectives” and “in-
effectives” is .79, suggesting that, whether relevant
effective or ineffective incidents are used, the relative
coverage by courses is much the same. Table 1 also
reveals that overall the “percentage taught” is not
much different for effective and ineffective incidents
(64.6% vs. 62.1%).


Curriculum Assessment with Critical Incidents 331 


Table 1
Results of Sorting Critical Incidents by Subject Matter Areas

                                       Effective       Ineffective       Summary
                                       N      %a       N      %a       N      %a

1. Seamanship
     Taught                           52     4.8     106     9.9     158    14.7
     Not-Taught                       12     1.1      40     3.7      52     4.8
     Total                            64     6.0     146    13.6     210    19.6
     % Taught b                           81.3            72.6            75.2

2. Orientation and Military Justice
     Taught                          125    11.6     240    22.4     365    34.0
     Not-Taught                       46     4.3      88     8.2     134    12.5
     Total                           171    15.9     328    30.6     499    46.5
     % Taught b                           73.1            73.2            73.1

3. Navigation
     Taught                            5     0.5       4     0.4       9     0.8
     Not-Taught                        0     0.0       1     0.1       1     0.1
     Total                             5     0.5       5     0.5      10     0.9
     % Taught b                          100.0            80.0            90.0

4. Operations
     Taught                           19     1.8      66     6.2      85     7.9
     Not-Taught                       22     2.1      57     5.3      79     7.4
     Total                            41     3.8     123    11.5     164    15.3
     % Taught b                           46.3            53.7            51.8

5. Weapons
     Taught                            7     0.7       8     0.7      15     1.4
     Not-Taught                       11     1.0      28     2.6      39     3.6
     Total                            18     1.7      36     3.4      54     5.0
     % Taught b                           38.9            22.2            27.8

6. Engineering
     Taught                           16     1.5      14     1.3      30     2.8
     Not-Taught                        4     0.4      18     1.7      22     2.1
     Total                            20     1.9      32     3.0      52     4.8
     % Taught b                           80.0            43.8            57.7

7. Military
     Taught                            0     0.0       2     0.2       2     0.2
     Not-Taught                        2     0.2       0     0.0       2     0.2
     Total                             2     0.2       2     0.2       4     0.4
     % Taught b                            0.0           100.0            50.0

8. Other
     Taught                            0     0.0       0     0.0       0     0.0
     Not-Taught                       26     2.4      37     3.4      63     5.9
     Total                            26     2.4      37     3.4      63     5.9
     % Taught b                            0.0             0.0             0.0

Total
     Taught                          224    20.9     440    41.0     664    61.9
     Not-Taught                      123    11.5     269    25.1     392    36.5
     Total                           347    32.3     709    66.1    1056    98.4 c
     % Taught b                           64.6            62.1            62.9

a Percentages in these columns are based on the original total number; N/1073 rounded to nearest tenth.
b Percentages in these rows represent the number of “taught” incidents in relation to the total number of incidents within
respective subject categories (i.e., “taught”/“total”).
c Seventeen incidents (1.6%) were incapable of being judged with confidence and are not included in this summary.




nificant aspects of duty for which training
should prepare the new ensign.
To place the notion of importance clearly in 
the framework of training, it was reasoned 
that to the extent that preduty training pro- 
vides the trainee with skills, knowledge, or 
attitudes required of him in the first few 
months of duty, or a sound background on 
which experience will develop them, that train- 
ing is successful. To the extent that it con- 
centrates on that which the trainee will not be 
called upon to use until after he has had a 
year or more of duty, or has been sent to 
other schools to acquire or reacquire them, 
initial training is being misdirected. From this 
standpoint, the immediacy of the opportunity 
or requirement to handle any given situation 
would be an index of its “importance.” It is
in this sense that “importance,” standing for
“importance for early usefulness,” is used in
this article.


Procedure 


A questionnaire, the Junior Officer Training Re-
quirements Checklist (JOTRC), was prepared in 10
forms containing an aggregate of 985 items.4 The
final format evolved from four preliminary try-out
versions.

Each form listed approximately 100 incidents, and
was divided into two parts. Part I contained ef-
fective incidents and Part II ineffective incidents for
Forms 1, 3, 5, 7, and 9. For Forms 2, 4, 6, 8, and
10, the order was reversed. About one hour was
needed to fill out a form. The incidents related to
all OCS instructional areas, and included those judged
earlier to be among the “not taught” as well as those
included in the curriculum.

The specific questions were:

1. For the situations involving effective incidents:
“How soon after reporting aboard would you expect
the new officer to be able to do this?”

2. For the situations involving ineffective incidents:
“How soon after reporting aboard would you expect
the new officer to be able to handle this situation
satisfactorily?”

The directions specified the writing of the num-
ber of months (directly after being commissioned),
ranging from 0 (for judgments that the reserve officer
should be able to perform well immediately on re-
porting aboard) to 12. “X” was to be used if the
respondent would not expect the new officer to be
able to handle the given type of situation satisfac-
torily until after he had been aboard for more than
a year. “N” was to be used if the reporter felt un-
able to make a judgment because of wording or in-
terpretation of the incident items.5

4 This number is smaller than that involved in the
two instructor sorts because of deletion of incidents:
(a) of clear duplication; (b) which might be mis-
interpreted by the judges because the situations rep-
resented contraventions of Navy policy (e.g., a num-
ber of incidents originally submitted as “effectives,”
turned out to be violations of regulations).

Two copies of the checklist were sent to each of
170 destroyer-type ships, one for the commanding
officer (CO) and one for his executive officer (XO).
Each of the 10 forms was sent to 30 to 50 officers.
A total of 301 JOTRCs were returned in time to be
analyzed.

For each of the 985 incidents, the estimates ob-
tained from COs and XOs were tallied and a median
value was found. We call these medians TESP
(“time expectancy for satisfactory performance”)
values.
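
The tallying step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: “X” is coded as 13 (as this article's footnote on computational work states), and “N” responses are simply dropped before the median is taken, which is an assumption made here and is not described in the article. The function name `tesp` and the sample responses are hypothetical.

```python
from statistics import median

X_CODE = 13  # "X" (more than 12 months) coded as 13, per the computational footnote


def tesp(responses):
    """Median months-to-satisfactory-performance for one incident.

    `responses` holds raw checklist entries: integers 0-12, "X", or "N".
    "N" (no judgment possible) is dropped here; that handling is an assumption.
    """
    months = []
    for entry in responses:
        if entry == "N":
            continue
        months.append(X_CODE if entry == "X" else int(entry))
    return median(months) if months else None


# Hypothetical responses from seven officers for a single incident.
example = [0, 2, 3, "X", 1, "N", 6]
print(tesp(example))
```

Using the median rather than the mean keeps a few “X” responses from pulling an item's value far up the scale.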


Analyses of Results 
Reliability of TESP Values 


Using Form 8 of the JOTRC, a reliability 
estimate for the CO-XO TESPs was obtained. 
The 44 questionnaires returned (out of 50) 
were divided in two groups of 22 each. For 
each item two TESPs were obtained. Cor- 
relation of the two sets of TESPs over the 95 
items gave a coefficient of .93.
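
A split-half estimate of this kind can be sketched as follows. The data are synthetic, the random assignment of questionnaires to the two groups is an assumption (the article does not say how the 44 forms were divided), and all names are illustrative.

```python
import random
from statistics import mean, median


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(ratings, seed=0):
    """Correlate item medians from two random halves of the questionnaires.

    `ratings` is a list of questionnaires, each a list of month estimates
    (one per item, all questionnaires covering the same items).
    """
    order = list(range(len(ratings)))
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    group_a = [ratings[i] for i in order[:half]]
    group_b = [ratings[i] for i in order[half:]]
    n_items = len(ratings[0])
    tesp_a = [median(q[j] for q in group_a) for j in range(n_items)]
    tesp_b = [median(q[j] for q in group_b) for j in range(n_items)]
    return pearson(tesp_a, tesp_b)


# Synthetic stand-in for Form 8: 44 questionnaires, 95 items, answers 0-13.
rng = random.Random(42)
true_months = [rng.uniform(0, 12) for _ in range(95)]
ratings = [
    [min(13, max(0, round(t + rng.gauss(0, 2)))) for t in true_months]
    for _ in range(44)
]
r = split_half_reliability(ratings)
print(round(r, 2))
```

The coefficient describes the stability of the item medians across two groups of judges, not the agreement of any single pair of officers.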


Distribution of TESP Values 


The distribution of TESPs for the 985 
JOTRC items was divided into incidents of 
“effective” (E) and “ineffective” (I) per- 
formance. 

There was a distinct difference in TESPs as- 
signed to E and I performances (median values 
of 6.0 and 2.5 months, respectively). The of- 
ficers responsible for operating these ships 
were most immediately concerned with keep- 
ing the novice (and themselves) out of trouble, 
and were content to wait longer for a quality 
of performance exceeding what is normally 
considered satisfactory. Overall the TESPs 
tended to pile up at the bottom (more “im- 
portant”) end of the scale (Table 2). 

These estimates seem to bear out assertions 
that destroyer duty requires of an officer 
rapid learning and assumption of responsi- 



5 The results to be presented must be interpreted
in terms of the status quo at the time of this investi-
gation. To the extent that the operational, training,
or personnel factors in the Navy change conditions,
the values obtained might be altered.

6 In all computational work the “X” response,
meaning “more than 12 months,” was assigned a
value of 13; the lower the value, the greater the
relevance of the incident to the OCS curriculum.


Table 2
Frequency Distribution of Incidents by Curriculum Area, Subcategorized as “Taught” and “Not Taught,”
With Mean of TESP Values for the Incidents in Each Subcategory Indicated

                                        Taught                 Not Taught                  Total
                                 No. of  % of            No. of  % of            No. of  % of
                                 Inci-   Inci-   Mean    Inci-   Inci-   Mean    Inci-   Inci-   Mean
Curriculum Area                  dents   dents a TESP    dents   dents a TESP    dents   dents a TESP

1. Seamanship                     143    16.3    4.34      46     5.2    4.28     189    21.5    4.32
2. Orientation and
   Military Justice               300    34.2    2.73      99    11.3    2.92     399    45.4    2.78
3. Navigation                       8     0.9    5.13       1     0.1    2.00       9     1.0    4.78
4. Operations                      77     8.8    4.48      67     7.6    5.00     144    16.4    4.72
5. Naval Weapons                   12     1.4    7.30      38     4.3    6.26      50     5.7    6.51
6. Engineering and
   Damage Control                  26     3.0    5.46      18     2.1    5.50      44     5.0    5.48
7. Military                         2     0.2    1.00       2     0.2    7.00       4     0.5    4.00
8. Other                           —      —      —         39     4.4    4.32      39     4.4    4.32
Summary                           568    64.7    3.62     310    35.3    4.33     878   100.0    3.87

a Percentages based on 878 incidents; N/878. Eliminated, because more than 10% of respondents omitted answers or
indicated that they could not clearly interpret the incident, were 91 items. Not included because they could not
be judged with confidence as to area of relevance were 16 incidents.

bility involving manifold skills and knowl-
edges, even allowing for the fact that all
ensigns are not expected to become competent
in all of the 985 incident situations.7 The fact
that the COs and XOs expected the ensigns to
be able to handle a sizable majority of the
situations after only a few months on board
seems to lend corroboration to the notion that
the over-all adequacy of the curriculum is, in
general, proportionate to the coverage given
to these situations, even allowing for differ-
ences in TESPs of different incidents. In
other words, if the relevant skills, knowledges,
and attitudes are “taught,” the curriculum
conforms to CO-XO judgments. If these are
not taught, COs and XOs say, in effect, they
ought to be.


TESP Values of Incidents Assigned to Subject 
Matter Areas 


The next analyses deal with the relative
“importance” of incidents assigned to differ-
ent subject matter areas and the relation be-
tween the number of incidents found to be
relevant to a given area and the TESP values
assigned to those incidents.

7 As indicated in the questionnaire instructions,
the respondent is required to “Suppose [with respect
to each incident] that his [the ensign’s] responsibili-
ties as an officer will give him the opportunity to
handle the situation as shown.”

In order to provide a basis of interpretation 
least vitiated by ambiguity, the following 
analyses were restricted to 878 of the original 
985 incidents included in the 10 JOTRC 
forms. Those 91 items, which more than 10%
of respondents did not answer or indicated 
that they could not clearly interpret, were de- 
leted. Sixteen items, which the OCS experts 
had not been able to assign with confidence 
to relevant subject matter areas, were also 
removed. The numbers of incidents “taught” 
and “not taught” in the six academic courses 
and in military training were counted, and 
the mean TESP was calculated for each of 
these categories and subcategories, as well as 
the “Others,” as shown in Table 2. Among 
the academic courses there is a close cor- 
respondence between the number of incidents 
said to be relevant to a given course by OCS 
instructors and the mean TESPs derived in- 
dependently from the judgments of “impor- 
tance for early usefulness” by the COs and 
XOs. The three academic courses which em- 
brace the largest number of incidents are also, 
in the same order, the courses which are the 
most “important” as indicated by the mean
TESP values. Of the six, only the courses in
Navigation and Naval Weapons, the former 
with a small number of cases, are out of order 
with respect to these two quantities. Again 
it would appear that there is substantial paral- 
lelism between number and TESP value of 
incidents when sorted into OCS curriculum 
areas. 
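
The parallelism between number and TESP value of incidents can be made concrete with a rank-order comparison. The sketch below uses the Total columns of Table 2 for the six academic courses and the standard Spearman formula without tie correction. This coefficient is not reported in the article; it is offered only as an illustration, and the names are hypothetical.

```python
# (number of incidents, mean TESP in months) from the Total columns of Table 2.
COURSES = {
    "Seamanship": (189, 4.32),
    "Orientation and Military Justice": (399, 2.78),
    "Navigation": (9, 4.78),
    "Operations": (144, 4.72),
    "Naval Weapons": (50, 6.51),
    "Engineering and Damage Control": (44, 5.48),
}


def ranks(values, reverse=False):
    """Rank names 1..n by value; ties are not handled (none occur here)."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Rank 1 = most incidents; rank 1 = lowest (most "important") mean TESP.
count_rank = ranks({c: n for c, (n, _) in COURSES.items()}, reverse=True)
tesp_rank = ranks({c: t for c, (_, t) in COURSES.items()})

n = len(COURSES)
d_squared = sum((count_rank[c] - tesp_rank[c]) ** 2 for c in COURSES)
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))

for course in COURSES:
    print(f"{course:35s}{count_rank[course]:3d}{tesp_rank[course]:3d}")
print(f"rank correlation: {rho:.2f}")
```

With these six rows the only rank differences arise for Navigation and Naval Weapons, which agrees with the statement that those two courses are out of order.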

Some indication of the curricular validity of 
OCS academic offerings is shown by the fact 
that the mean TESP for material “taught” is 
lower (more important) than for material “not 
taught.” However the caution introduced 
earlier, because of the large proportion of in- 
cidents said to be related to the material 
taught in a relatively small number of sessions, 
still applies. 

We have gone one step further and pursued 
the analysis of the 568 incidents said to be 
“taught” in the sample of 878 incidents. We
determined the mean TESP for the incidents 
assigned to each instructional session to which 
one or more incidents were judged to be rele- 
vant. A distribution of these 157 sessions 
against mean TESPs of incidents assigned to 
each session made it apparent that the lessons 
covering the largest number of relevant in- 
cidents also tend to be considered more “im- 
portant for early usefulness” aboard ship. 
Using 28 lessons for which somewhat more 
reliable mean TESPs are available (based on 
10 or more incidents) we again established a 
fair correspondence between “number” and 
“importance” of incidents in a given category. 
The correlation for these 28 lessons was .65.8 

An inspection was made of the content of 
these 28 lessons. It supports the view that of 
all his background, his skills in human rela- 
tions will be those which a new officer will be 
most immediately and most frequently called 
upon to demonstrate. It appears that the 
first thing a junior officer has to learn is how 
to manage his relations with contemporaries 
and seniors. Leadership and Management of 
subordinate personnel show up as next most 
“important” skills. Demands upon his tech-
nical skills and knowledges are not made until
later.


8 The sign of the coefficient has been reversed here,
to account for the fact that the more “important”
items have lower TESPs, and thus permit direct read-
ing and interpretation.



Discussion 


It has been pointed out that the general 
purpose of OCS training is to develop officers 
with general familiarity in subject matter 
areas containing technical problems and with 
ability to supervise technicians, not technical 
specialists. It would appear that officers re- 
sponsible for operating destroyers are in gen- 
eral agreement with that emphasis. They are 
perhaps inclined to give greater emphasis to 
the requirement that he be able to take up 
quickly, and carry on satisfactorily, responsi- 
bilities requiring “know-how” in dealing with 
people rather than technical skill. Critical 
incidents involving such skills are also, by 
far, the kind most frequently reported about 
ensigns. 

Other uses of the results are available. For 
example, we have concentrated comment on 
the curriculum areas or lessons most fre- 
quently the subject matter of relevant inci- 
dents and to which most “importance” is at- 
tributed. We have not spoken much of the 
lessons involving relevance with none of our 
pool of incidents or those of low “importance.” 
Obviously, these would be examined first in 
terms of their contribution to the training 
program and for the possibility of using the 
time given them for instruction on some of 
the incident-behaviors which were highly con- 
centrated in a few sessions of the program 
studied (e.g., leadership). 

The scope of this research has been limited. 
Although the feasibility of the method has 
been demonstrated by embracing the whole 
curriculum as the field of inquiry, insufficient 
data could be brought to bear on many areas. 
More concentrated attacks can be made upon 
specific curriculum problems, using the same 
procedures. For instance, if it were desired
to revamp the Navigation offerings, critical 
incidents involving navigation could be re- 
quested from officers on ships and then sorted 
by lessons of relevance. Since number and 
“importance” of incidents tend to be associ- 
ated, it might be possible to eliminate the de- 
termination of TESPs, if one were willing to 
sacrifice some precision. 

Naturally, if any considerable changes in
the curriculum were to be made, follow-up
studies should be conducted to see if the de-
sired changes in on-the-job performance came
about.

In this exploratory research we have not 
concerned ourselves with such matters as the 
policies, directives, costs and laws governing 
operation of a school, nor with the time, staff, 
techniques, or facilities, necessary or available 
to achieve satisfactory levels of competence 
through training in the various aspects of a 
curriculum. However, they are within the 
purview of the administrative authorities gov- 
erning a training program. 

The case in point here has been the Navy’s 
Officer Candidate School. However, the ap-
plicability of the methodology to other train-
ing and educational settings is apparent.


Summary 


This research sought to identify those as-
pects of the Navy OCS curriculum which
were most and least relevant to duties of
newly commissioned ensigns aboard destroy-
ers; thus to provide information useful in im-
provement of training.

More than 1,000 critical incidents of ef-
fective and ineffective performance by de-
stroyer ensigns were available from earlier
research. These were sorted according to the
OCS curriculum area to which each was most
relevant. The “relevant” incidents assigned
to each area were then classified as: (a)
“Taught”—currently the subject of specific
lessons of OCS instruction, and (b) “Not
Taught”—pertains to a subject matter area in
the curriculum, but not covered due to time or
facility limitations.
A Junior Officer Training Requirements
Checklist was sent to 340 commanding and
executive officers of destroyer-type vessels,
and was completed by more than 300 of them.
The checklist was prepared in 10 forms, each
containing approximately 100 incidents. Each
form was sent to 30 to 50 officers, with in-
structions to make a judgment for each in-
cident as to: “How soon [in number of
months] after his reporting aboard [directly 
after being commissioned], under normal con- 
ditions, would you expect the new [reserve] 
officer to be able to handle the situation to 
your satisfaction?” From their answers an 
index of “time expectancy for satisfactory 
performance” was determined for each in- 
cident with high reliability. It was assumed 
that the sooner the ensign is expected to 
handle a situation satisfactorily, the more “im- 
portant” it is that the relevant material be 
learned at OCS. 

The findings indicate that the new ensign 
most frequently and most immediately will be 
called upon to draw on background relevant 
to human relations, leadership, and personnel 
administration skills; technical skills are ex- 
pected to be developed later. 

Suggestions are made for utilization of the
current findings and directions for further re-
search.


Received January 9, 1958. 


References 


1. Flanagan, J. C. Research techniques for develop- 
ing educational objectives. Educ. Rec., 1947, 
28, 139-148. 

2. Flanagan, J. C. The critical requirements ap- 
proach to educational objectives. Sch. & Soc., 
1950, 71, 321-324. 

3. Flanagan, J. C. The critical incident technique.
Psychol. Bull., 1954, 51, 327-358.

4. Glickman, A. S., & Vallance, T. R. An exploratory
study of the applicability of critical incident
techniques to the assessment of curricula for
officer candidate training. USN Bur. Naval
Pers. Res. Tech. Bull., 1954, No. 54-23.

5. Mahler, W. R., & Monroe, W. H. How industry
determines the need for and effectiveness of
training. USA Personn. Res. Sect. Rept., 1952,
No. 929.

6. Vallance, T. R., Glickman, A. S., & Vasilas, J. N.
Critical incidents in junior officer duties aboard
destroyer-type vessels. USN Bur. Naval Pers.
Res. Tech. Bull., 1954, No. 54-4.


Journal of Applied Psychology 
Vol. 42, No. 5, 1958 


A Personality Inventory Employing Occupational Titles * 


John L. Holland 


National Merit Scholarship Corporation 


In the last two decades the research litera- 
ture concerning the meaning of occupational 
choice—the relation of vocational interests to 
attitudes and personality—has implied with 
increasing clarity that a useful personality in- 
ventory might be constructed from occupa- 
tional or interest test content (3, 4, 12, 15). 
The present experimental inventory represents 
a further testing and application of this hy- 
pothesis. 

Essentially, the Holland Vocational Prefer- 
ence Inventory is a personality inventory 
which uses occupational titles for content. It 
is constructed to give a maximum amount of 
reliable and valid information with a minimum 
amount of testing and scoring time, skill, and 
expense. Many variables usually included in 
interest and personality tests have been in- 
tegrated in the HVPI. These variables are 
assumed to yield a broad range of information 
concerning the S’s personal adjustment, values,
attitudes, and vocational motivation. The in- 
ventory is self-administering. The Ss record 
their “feelings and attitudes” about occupa- 
tional titles by indicating either “interest” 
(appeal) or “dislike” for each of the 300 
items.

Because of the relative neutrality of its 
title and occupational content, the inventory 
appears to have a number of uniquely de-
sirable qualities. Occupational titles provide
a subtle set of stimuli which minimize the 
negative reactions sometimes provoked by 
more obvious personality inventories, reduce 
the S’s need to “fake” since the inventory is 
perceived as a “vocational” inventory, and 
virtually eliminate those requests for per- 
sonality interpretations which usually follow 
on the heels of the administration of other 
personality inventories. 


1 I wish to acknowledge the support and encourage-
ment of numerous colleagues and students at Western
Reserve University, VA Hospital, Perry Point, Mary-
land, and the University of Maryland, who have
made this study possible.




Rationale 


The development of the inventory has been 
guided by a psychological rationale integrated 
from a number of divergent fields: psychology, 
psychiatry, test theory, and sociology. The 
aim of this formulation is to provide a theo- 
retical framework for using and interpreting 
the inventory, and to present a means of 
extending and clarifying its construct validity. 
The following assumptions summarize this 
tentative rationale: 

1. The choice of an occupation is an expres-
sive act which reflects the person’s motivation,
knowledge, personality, and ability. Occupa- 
tions represent a way of life, an environ- 
ment rather than a set of isolated work func- 
tions or skills. To work as a carpenter means 
not only to use tools but also to have a cer- 
tain status, community role, and a special 
pattern of living. In this sense, the choice of 
an occupational title represents several kinds 
of information: the S’s motivation, his knowl- 
edge of the occupation in question, his insight 
and understanding of himself, and his abilities. 
In short, item responses may be thought of as 
limited but useful expressive or projective 
protocols. 

2. The interaction of the person and his 
environment creates a limited number of fa- 
vorite methods for dealing with interpersonal 
and environmental problems. The various 
HVPI scales are assumed to measure some
of these common or favorite methods of ad- 
justment. Translated into scale terms, peaks 
reveal the person’s favorite methods whereas 
low points indicate the rejected methods of 
adjustment. Or, peaks may represent desir- 
able roles and situations, and low points, 
threatening or distasteful roles and situations. 

The foregoing assumption is predicated by
another assumption; namely, that the various 
classes or occupational groups furnish dif- 
ferent kinds of gratifications or satisfactions 
and require different abilities, identifications, 
values, and attitudes. This particular hy-
pothesis has extensive empirical support from
studies relating “vocational interests” to per- 
sonality variables, psychiatric status, values 
and attitudes. Typical studies in this area 
include those of Berdie (1), Darley (2), Forer 
(3), Garman (4), Gough (5), Laurent (7), 
Sternberg (11), Terman (13), and Weir (15). 

3. The development of adequate adjustive
techniques requires accurate discrimination
among potential environments. The ability 
to discriminate potentially satisfying and 
beneficial environments from potentially dis- 
satisfying and unhealthy environments is im- 
perative for creative health. In this sense, the 
inventory is a miniature performance test of 
the S’s understanding of his surroundings in 
relation to himself; that is, the choice of an 
occupational title is a measure of the S’s in- 
sight and understanding as well as a sign of 
his motivation and his intellectual comprehen- 
sion of the occupation in question. This third 
assumption has two corollaries which seem 
useful for interpreting “responsiveness” to the
inventory.

3a. The total number of preferred occupa-
tions is a function of a number of personality
variables. These hypothesized variables are
supported by some of the evidence for the
validity of the inventory, and by several scat-
tered studies reported by Berdie (1), Weir
(15), and others.

Specifically, the total number of preferred
occupations is a function of dependency, ag-
gressiveness, mood, degree of cultural intro-
ception, self-control, sociability, and defensive-
ness. Over-responsiveness suggests a lack of
adequate discrimination which may be re-
flected in dependence, aggression, euphoria,
over-introception of the culture, impulsivity,
sociability, and frankness. In contrast, under-
responsiveness appears indicative of greater
independence, passivity, depression, rejection
of the culture, over-control, withdrawal, and
defensiveness.

3b. The inability to make discriminations
among occupations is indicative of conflict
or disorganized self-understanding. Just as
the inability to make everyday decisions is a
result of conflicting motivations, so the in-
ability to make positive or negative choices of
occupations (environments) within the in-
ventory is a sign of conflict. In this sense,
conflict is defined as divergent, inaccurate, or 
irreconcilable views about one’s abilities, 
needs, and sources of gratification; and is ac- 
companied by the chronic emotional upset 
which results from such conflict. In test 
terms, inability to make choices is reflected in 
the total number of omitted items. 

4. Interest inventories are personality in- 
ventories. Interest and personality inventories 
are identical in principle and provide similar 
information about the person, although their 
content is quite diverse. Both kinds of in- 
ventories reveal how the S perceives himself 
and his milieu. 


Construction of the Inventory 
Preliminary Forms 


The first form of the inventory was estab- 
lished by constructing eight a priori scales: 
Physical Activity, Intellectuality, Responsi- 
bility, Conformity, Verbal Activity, Emotion- 
ality, Reality Orientation, and Acquiescence. 
These scales were devised by reviewing the 
interest and vocational choice literature with 
reference to personality factors, and by using 
this information to create personality formula- 
tions for each scale. With the formulations 
as a guide, the author constructed the scales 
by selecting occupational titles which pre- 
sumably epitomized the scale rationale. For 
example, the Physical Activity scale consists 
of occupational titles which imply motoric 
activities and skills. Typical titles in this 
scale include machinist, North Woods guide, 
forest ranger, electronic technician, etc. Scales 
are scored by counting the number of “pre- 
ferred” occupations in a given scale. 
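
The scoring rule just described can be sketched in a few lines. This is only an illustration: the item numbers, the assignment of items to the Physical Activity scale, and the response coding are hypothetical; only the rule itself (count the "preferred" occupations that belong to a scale) comes from the text.

```python
def score_scale(responses, scale_items):
    """Count the items of one scale that the subject marked as preferred."""
    return sum(1 for item in scale_items if responses.get(item) == "like")


# Hypothetical key: item numbers assumed to belong to the Physical Activity scale.
PHYSICAL_ACTIVITY_ITEMS = [3, 17, 42, 58]

# Hypothetical answer sheet: item number -> "like", "dislike", or None (omitted).
responses = {3: "like", 17: "dislike", 42: "like", 58: None, 60: "like"}

pa_score = score_scale(responses, PHYSICAL_ACTIVITY_ITEMS)
print(pa_score)
```

Item 60 is liked but does not belong to the hypothetical scale, so it does not enter the score.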

The second form consists of a revision of 
the first form by means of an internal con- 
sistency analysis of the individual scales and 
of six added scales, forming a 14-variable 
test. The additional scales were Control, Ag- 
gressiveness, Mf, Status, Heterosexuality, and 
Infrequency. Next, all scales were intercor-
related and a cluster analysis was performed
in order to eliminate scales and clarify scale 
meanings. This analysis suggested that 10 
scales were sufficiently independent to merit 
retention and further revision. 




Present Form 


The third form was secured by performing
an internal consistency analysis for those 10
scales which survived the cluster analysis of 
the second form. This analysis was accomp- 
lished with two samples of 300 male and fe- 
male college freshmen. The method of analy- 
sis was to compare the upper and lower 25 
per cent of each sample on each scale. The 
most discriminating items were then selected 
for the final scales, using the “Kelly Tech- 
nique” nomographed by Lawshe (8). 
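
The upper-lower group comparison can be illustrated with a minimal sketch. It assumes a plain difference in endorsement proportions between the top and bottom quarters of the scale-score distribution; it does not reproduce the Lawshe nomograph, and the scores and item responses are hypothetical.

```python
def discrimination_index(scale_scores, item_endorsed, fraction=0.25):
    """Proportion endorsing the item in the upper group minus the lower group."""
    order = sorted(range(len(scale_scores)), key=lambda i: scale_scores[i])
    k = max(1, int(len(order) * fraction))  # size of each extreme group
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_endorsed[i] for i in upper) / k
    p_lower = sum(item_endorsed[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data for eight subjects: total scale score and one item (1 = "like").
scale_scores = [2, 9, 4, 7, 1, 8, 5, 3]
item_endorsed = [0, 1, 0, 1, 0, 1, 1, 0]

index = discrimination_index(scale_scores, item_endorsed)
print(index)
```

Items with the largest index would be the candidates for retention under this simplified rule.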

The Masculinity-Femininity, Status, Infre- 
quency, and Acquiescence Scales are excep- 
tions to the construction method outlined 
above. These scales have a direct empirical 
development. The Status Scale was devised 
by reviewing three studies of occupational 
status which used the rank order method (6,
9, 10). Scale items were derived by making
all items satisfy the data of all three studies.
For example, “Physician” occurs as a high 
status choice in each study. The final scale 
was then keyed so that preferences for occu- 
pations of high status and aversions to occu- 
pations of low status result in a high status 
score.

Construction of the Mf Scale began with a
comparison of the responses of 170 male and
female high school students on the second
form. This scale was then revised by a com-
parison of two groups of 150 male and female
college freshmen.

The Infrequency Scale consists of 25 items 
which are rarely “liked” and 25 items which 
are rarely “disliked.” It was constructed 
from frequency counts of the 300 test items 
for samples of 119 male and 100 female col- 
lege freshmen. The 25 most popular and 25 
least popular items were selected from each 
sample to form scales by reversing the scor-
ing; that is, popular items are scored “dis-
like” and unpopular items are scored “like.”
Accordingly, high scores represent unusual
or unpopular choices, whereas low scores
represent popular choices.
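
The reversed keying of the Infrequency Scale can be sketched as follows. The sketch simplifies the procedure to a single sample and a handful of items; the "like" counts, the item numbers, and the value of k are hypothetical (the text uses 25 items at each extreme).

```python
def build_infrequency_key(like_counts, k):
    """Key the k least popular items as 'like' and the k most popular as 'dislike'."""
    ranked = sorted(like_counts, key=like_counts.get)  # least popular first
    key = {item: "like" for item in ranked[:k]}
    key.update({item: "dislike" for item in ranked[-k:]})
    return key


def infrequency_score(responses, key):
    """Count the responses that agree with the reversed (infrequent) key."""
    return sum(1 for item, keyed in key.items() if responses.get(item) == keyed)


# Hypothetical counts of "like" responses per item in a normative sample.
like_counts = {1: 90, 2: 5, 3: 50, 4: 80, 5: 10, 6: 45}
key = build_infrequency_key(like_counts, k=2)

# Hypothetical answer sheet for one subject.
responses = {1: "dislike", 2: "like", 3: "like", 4: "like", 5: "dislike", 6: "like"}
score = infrequency_score(responses, key)
print(key, score)
```

A subject who dislikes a popular occupation or likes an unpopular one gains a point, so high scores mark unusual choices.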

The Acquiescence Scale provides an esti- 
mate of the S’s acquiescence, his tendency to 
respond “Appeal” or “Like.” It is scored by 
counting the number of “Appeal” responses 
for Items 1 through 30. Although a more ac-
curate estimate of acquiescence can be ob-
tained by counting the “Appeal” responses
for all test items, Items 1-30 provide a useful 
and economical estimate since Items 1-30 
correlate .74 with Items 31-300 for males and 
.69 for females.

The third and present form of the inventory 
contains 13 scales in all: 3 response-set scales
and 10 personality scales. The response-set 


Table 1


Scale Matrix for Male and Female College Freshmen


Scales 

Leen MPC Lye” “ne ac Ag Mf Sst Inf Ac 
Tn 43 e r EE ae an 2 Ce: 
Re 05 16 eae | Ogee a Bi ee gg BE 
ci 7 05) at Be ISP ROL © 30" 415 «GR Sg. ™ BE 
VA 35m 05.5) Mes a 2 Bi Os 3g at 8 
Em 35 70 SARAN oa ion a i fay. 5 
Co re o OD a oe =02) =27 ‘ 95° 30 
‘Ag Se aE ae as yg uw S52) ee 12 
Mf ee Sa, s 66 20. O2e Wai 
St A UN ie T2 ie 40 -17 485 
Tnt Pee. 48 s oe O S CT 37 
Ke Se ae Sa a eee 


Note. r of .30 is significant at .05 level. Intercorrelations for males are
shown in lower left triangle of matrix; for females, in upper right triangle.

* Since the Physical Activity items for females are homogeneous with the
Intellectuality items, they are included in the latter scale; consequently, the
Physical Activity scale is omitted in the female form of the inventory.




scales include the: 1. Question Scale, or the 
number of omitted items, 2. Infrequency 
Scale, and 3. Acquiescence Scale. The re- 
maining scales are: 1. Physical Activity, 2. In- 
tellectuality, 3. Responsibility, 4. Conformity, 
5. Verbal Activity, 6. Emotionality, 7. Con- 
trol, 8. Aggressiveness, 9. Masculinity-Femi- 
ninity, and 10. Status. 

To secure two equivalent parts or tests, 
each scale has been halved and the halves 
separated to form two tests. This procedure 
establishes a useful format for determining re- 
liability and presents an opportunity for de- 
termining the value of a short form. Table 1 
shows the scale interrelationships for college 
male and female samples of 100 each. For 
males, the intercorrelations range from —.60 
to .83, and average .36. For females, r’s
range from —.44 to .64 and average .26. 


Age and Intelligence * 


The relation of HVPI scales to age and in- 
telligence has been tested, since it is desirable 
for interpretative purposes to have the in- 
ventory relatively independent of these vari- 
ables. Heterogeneous samples of high school 
and college students and of employed adults 
were combined to increase the age range in
correlating age with HVPI scales. Males
range in age from 15 to 77; females range
from 17 to 60 years. In general, these rela-
tionships are either insignificant or of a low
order. The correlation of -.64 between Con-
trol scale and age for females is the exception
to these findings.

The relation between intelligence and HVPI
scales was estimated by correlating the Won-
derlic Personnel Test, a revision of the Army
Alpha, with the HVPI scales; samples of em-
ployed hospital attendants were used. None
of the resulting correlations are significant;
however, the range of Wonderlic scores is
somewhat attenuated so that significant rela-
tionships may occur in more heterogeneous
samples.

Tables; izi ion of the inventory 
Scales a yan eee scale reliabilities 
are deposited with the American Documentation
Institute. Order Document No. 5689, remitting
25 for 35-mm microfilm or $1.25 for 6 by 8 in.
photocopies.




Reliability * 

The internal consistency of the third re- 
vision was estimated by correlating Part I 
against Part II of each scale, using samples 
of 100 male and female college freshmen. The 
scale divisions are assumed to be equivalent, 
since items were assigned to Part I or II ran- 
domly. For males, these coefficients cor- 
rected for length range from .72 to .95, with 
a median of .85. For females, they range 
from .68 to .90, with a median of .79. 
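
The text does not name the correction for length. A minimal sketch, assuming it is the usual Spearman-Brown step-up from a half-test correlation to the full-length scale, is given below; the half-test correlation of .60 is hypothetical.

```python
def spearman_brown(r_half, factor=2.0):
    """Estimate the reliability of a test lengthened by `factor` (2 = full from halves)."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Hypothetical Part I vs. Part II correlation for one scale.
corrected = spearman_brown(0.60)
print(round(corrected, 2))
```

With two halves the formula reduces to 2r / (1 + r), so the corrected coefficient is always at least as large as the observed half-test correlation when r is positive.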

An estimate of retest reliability was ob- 
tained from data supplied by Walsh (14), who 
tested a sample of 38 TB patients before and 
after lung surgery. The time interval be- 
tween testings averaged about four months. 
The retest correlations range from .70 to .87, 
with a median of .75. 


Validity 

The inventory differentiates significantly a 
number of defined criterion groups which 
represent a wide range of personality differ- 
ences. In general, the obtained differences 
support the construct validity of the inven- 
tory. These findings are summarized in the 
following sections. 


Table 2


Mean Scale Differences for Matched Controls
and Psychiatric Patients
(Age, Occupational Class, and Status)


            Controls          Patients
            (N = 100)         (N = 100)
Scale     Mean     SD       Mean     SD        P
Inf*      13.2    5.9       18.2    7.4     <.001
Ac        12.5    5.8       10.4    6.9     <.05
PA        15.0    7.9       10.9    8.7     <.01
Int       10.5    7.6        8.7    N8      <.10
Re         9.5    6.2        7.6    13      <.06
Cf        10.4    6.7        7.5    8i      <.01
VA        10.1    6.6        7.3    AA      <.01
Em         8.1    7.9        8.2    8.2       -
Co*       11.6    Ja        13.3    5.8     <.07
Ag        13.3    8.4       10.0    8.9     <.01
Mf        12.3    4.4        9.9    4.0     <.001
St        10.1    4.5        9.9    5.0       -


* Variances significantly different beyond .05 level.




Table 3 


Distributions of Profile Peaks for Controls 
and Psychiatric Patients 


Controls Patients 

Scale (N = 183) (N = 186) 
PA 38 23 
Int 5 8 
Re 19 17 
Cf 41 27 
VA 1 1 
Mf 19 8 
St 1 9 


Note.—For this comparison, the control and patient samples 


d Ss to these groups 
in order to increase the “expected” cell frequencies. These
additional Ss closely approximate the matched samples with 


Controls and Psychiatric Patients 


For this comparison, male samples of 100 
controls and 100 psychiatric patients were 
matched for age, socioeconomic status, and
principal occupation. The group distributions
for these matched variables are not signifi-
cantly different. The control group consisted
of employed and unemployed adults tested in 


the process of employee selection and similar 
personnel studies at a VA psychiatric hospital. 
The patient group included 74 psychotic and 
26 nonpsychotic patients, all without organic 
involvement. Most of them were tested near 
the end of their hospitalization, although a 
few were tested shortly after admission. Table
2 summarizes the results of the comparison. 

The significant differences in Table 2 are
in the expected directions; that is, a literal 
interpretation of the scale meanings yields 
descriptions for normals and for psychiatric 
patients which are congruent with stereotypes 
of normality and psychiatric status. These 
differences reveal that normals give common 
responses (Inf), are more responsive (Ac), 
are physically active (PA), conforming (Cf), 
verbal (VA), aggressive (Ag), and masculine 
(Mf). In contrast, patients give infrequent 
or unusual responses (Inf), are less responsive 
(Ac), less active (PA), nonconforming (Cf), 
less verbal (VA), more passive (Ag), and 
more feminine (Mf). 

Similar differentiation is obtained between 
control and patient samples when distribu- 
tions of high point scales are formed. For 
this comparison, individual profiles were clas- 
sified by the highest scale in the profile. The 
resulting distributions for patients and con- 


Table 4


Mean Scale Scores for Controls, Psychiatric Patients, TB Patients,
and Psychopaths


          Control     Psychopath      TB        Psychiatric
Scale    (N = 100)     (N = 50)    (N = 61)     (N = 100)        F
?          25.2         51.5         16.2         30.9           t
Inf        13.2         14.5         18.9         18.2        13.47***
Ac         12.5         10.3         11.3         10.4         2.02
PA         15.0         12.0         13.0         10.9         3.88**
Int        10.5          8.9          8.9          8.7          .96
Re          9.5          6.9          7.0          7.6         3.03*
Cf         10.4          6.9          9.3          7.5         4.16**
VA         10.1          7.7          8.1          7.3         2.62
Em          8.1          9.1          8.5          8.2          .18
Co         11.6         11.2         13.0         13.3         2.09
Ag         13.3          9.3         10.6         10.0         3.29*
Mf         12.3         10.4         10.9          9.9         §.57***
St         10.1         10.0          9.9          9.9          .04


Note. Controls and psychiatric patients were matched for age, status, and occupation.
Psychopaths and TB patients approximate first two groups for these variables. Note
insignificant differences on status scale.
.05.



Table 5


Significant Mean Score Differences


Scale                 Criterion Groups        Sig Level
Infrequency           NP, TB > C, Pd          <.001, <.01
Physical Activity     C > NP, Pd              <.01, <.05
Responsibility        C > Pd, TB              <.05, <.05
Conformity            C > Pd, NP              <.01, <.01
Mf                    C > Pd, NP, TB          <.05, <.001, <.05
Aggressiveness        C > Pd, NP              <.01, <.01


Table 6

Distributions of Profile Peaks for University of Maryland Colleges

Entering Male Freshmen
(N = 382)

                      Arts and
                      Science     Engineering   Business    Education   Agriculture
Scales               (N = 101)    (N = 120)     (N = 96)    (N = 38)    (N = 27)
Physical Activity        15           33            3          11          11
Intellectuality          25           39            7           2           9
Responsibility           22            5           18          13           1
Conformity                9           13           41           6           4
Verbal Activity          10            8           15           2           1
Emotionality             20           22           12           4           1

Entering Female Freshmen
(N = 199)

                      Arts and      Home
                      Science     Economics     Business    Education    Nursing
Scales               (N = 59)     (N = 46)      (N = 17)    (N = 58)    (N = 19)
Intellectuality          14            4            3           5           7
Responsibility           10            4            1          20           8
Conformity               14           12           10          18           3
Verbal Activity          12           11            2          10           0
Emotionality              9           15            1           5           1


Note. For males, distributions of scale peaks for Arts and Sciences, Engineering, and
Business are significantly different at the .01 and .001 levels. For females, distributions
for Arts and Sciences, Home Economics, and Education are significantly different at the
.10, .06, and .01 levels, respectively. For this comparison, only the first six inventory
scales were employed.


trols, shown in Table 3, are significant beyond
the .001 level.


Normals, Psychopaths, Tuberculosis, and Psychiatric Patients


The study reported above was extended by
comparing the matched control and psychi-
atric samples with two additional samples of
61 TB patients and 50 criminal psychopaths.*
These samples are assumed to have a crude
comparability with respect to age, socioeco-
nomic status, and intelligence. The mean ages
for the four groups range from 27.0 to 33.2
years; the four means on the status scale
range only from 9.9 to 10.1 raw score points;
there is a similar distribution of principal oc-
cupations in each of the four groups.

* The prison psychopaths were selected by Lloyd
alkins, psychologist for the Maryland peni-
tentiary. All Ss in this sample were diagnosed by a
test battery including, at a minimum, the MMPI,
W-B, and DAP.

The group means were tested first by a
simple analysis of variance; then individual t
tests were performed for the significant analy-
sis of variance tests. The results of these
analyses are shown in Tables 4 and 5.
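
The first stage of this procedure, a simple (one-way) analysis of variance across the four groups, can be sketched as below. The raw scores are hypothetical, since the paper reports only group means and F ratios; the function computes the ordinary between-groups over within-groups variance ratio.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scale scores for four small groups
# (control, psychopath, TB, psychiatric); not data from the study.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [3, 4, 5]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
print(f_ratio, df_between, df_within)
```

Only scales whose F ratio reaches significance would then be followed up with t tests between pairs of groups, as the text describes.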


College Choice 


An additional study attempted to differenti- 
ate the profiles of university freshmen on the 
basis of their college choice. The sample in- 
cluded groups of male freshmen majoring in 
business, arts and sciences, and engineering. 
For the analysis, 209 profiles were classified 
by high point scales among the first six scales 
of the second revision and frequency distribu-
tions formed for each college. χ² tests re-
vealed that differences between any pair of
colleges are significant beyond the .05 level. 

The following year, this test was replicated
with a sample of 317 freshmen, using the
third revision of the inventory. Two of the
differences among scale peak distributions are
significant at the .001 level and a third at the
.01 level. In addition, these differences are
substantial. The engineers, for example, have
60 per cent of their peaks among the Physical
Activity and Intellectuality scales, whereas
the business freshmen have only 10.4 per cent
of their peaks among these scales. Likewise,
the business freshmen have 77 per cent of
their peaks among the Responsibility, Con-
formity, and Verbal Activity scales, whereas
the engineers have 22 per cent of their peaks
among these scales. Table 6 summarizes this
evidence along with several small samples not
amenable to statistical test.
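
The percentages quoted above can be checked against the Table 6 counts for engineering and business freshmen. The sketch below also computes a Pearson chi-square for that pair of distributions; the paper does not report the statistic itself, so this is an illustrative recomputation rather than the author's analysis.

```python
SCALES = ["Physical Activity", "Intellectuality", "Responsibility",
          "Conformity", "Verbal Activity", "Emotionality"]

# Profile-peak counts for male freshmen, as printed in Table 6.
ENGINEERING = [33, 39, 5, 13, 8, 22]
BUSINESS = [3, 7, 18, 41, 15, 12]


def peak_share(counts, scale_names):
    """Per cent of a college's profile peaks falling on the named scales."""
    hits = sum(c for s, c in zip(SCALES, counts) if s in scale_names)
    return 100.0 * hits / sum(counts)


def chi_square(rows):
    """Pearson chi-square for a table of observed frequencies."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat


technical = {"Physical Activity", "Intellectuality"}
social = {"Responsibility", "Conformity", "Verbal Activity"}

eng_technical = peak_share(ENGINEERING, technical)
bus_technical = peak_share(BUSINESS, technical)
bus_social = peak_share(BUSINESS, social)
eng_social = peak_share(ENGINEERING, social)
chi2 = chi_square([ENGINEERING, BUSINESS])
print(eng_technical, bus_technical, bus_social, eng_social, chi2)
```

With two colleges and six scales the statistic has 5 degrees of freedom.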


Received January 15, 1958. 


References 


1. Berdie, R. F. Likes, dislikes, and vocational interests. J. appl. Psychol., 1943, 27, 180-189.

2. Darley, J. G. A preliminary study of relations between attitude, adjustment, and
vocational interest tests. J. educ. Psychol., 1938, 29, 467-473.

3. Forer, B. R. Personality dynamics and occupational choice. Amer. Psychologist,
1951, 6, 378-379. (Abstract)

4. Garman, G. D. The Strong Vocational Interest Blank as a measure of manifest
anxiety. Amer. Psychologist, 1954, 9, 372-373. (Abstract)

5. Gough, H. G., McKee, M. G., & Yandell, R. J. Adjective check list analyses of a
number of selected psychometric and assessment variables. Berkeley: Inst. Pers.
Assessment and Research, Univer. Calif., 1955. (Mimeo.)

6. Hall, C. W. The prestige values assigned to a group of 249 occupations by 200
independent raters. Western Reserve Univer. (Mimeo.)

7. Laurent, H. A study of the development backgrounds of men to determine by means
of the biographical information blank the relationship between factors in their
early backgrounds and their choice of professions. Unpublished doctoral
dissertation, 1951, Western Reserve Univer.

8. Lawshe, C. H., Jr. A nomograph for estimating the validity of test items. J. appl.
Psychol., 1942, 26, 846-849.

9. National Opinion Research Center. Jobs and occupations: A popular evaluation.
Opinion News, 1947, 9, 3-13.

10. Randall, H. Occupational Prestige. Occupational Planning Committee, Cleveland
Welfare Federation, 1951. (Mimeo.)

11. Sternberg, C. Personality trait patterns of college students majoring in different
fields. Psychol. Monogr., 1955, 69, No. 18 (Whole No. 403).

12. Strong, E. K., Jr. Vocational interests of men and women. Stanford: Stanford
Univer. Press, 1943.

13. Terman, L. M. Scientists and nonscientists in a group of 800 gifted men. Psychol.
Monogr., 1954, 68 (Whole No. 7).

14. Walsh, R. P. Vocational interests, their stability and personality correlates in
hospitalized tuberculosis patients. Unpublished master’s thesis, Univer.
Maryland, 1956.

15. Weir, J. R. An attempt to identify vocational interest profiles within a
neuropsychiatric population. Unpublished doctoral dissertation, Univer.
California, 1951.


Journal of Applied Psychology


Length of Work Periods in Visual Research * 


Miles A. Tinker 


University of Minnesota 


The duration of the task employed in in- 
vestigating visual efficiency has varied consid- 
erably from one reported experiment to an- 
other. In general the tendency has been to 
use relatively short work periods, i.e., one to 
five minutes. The reliability and validity of 
measurement as related to length of work
period is an important issue. In an earlier
study by Tinker (3), it seemed that a pro-
longed reading task, such as 10 minutes, was
necessary to achieve valid results in certain
critical comparisons where differences tend to
be small. One of the difficulties in this ear-
lier study (3) was that results for brief work
periods in a 1928 investigation were com-
pared with prolonged work periods in the
1955 report. It now seems obvious that such
comparisons should be made for work pe-
riods by the same group of subjects.

In the 1955 study, Tinker (3) did show
that the results were the same in comparing
italic with roman lower case printing in each
of three successive 10-minute work periods
and in a 30-minute work period. Similar
trends were found in comparing lower case
with all-capital printing in each of four suc-
cessive four-minute work periods and in a 16-
minute work period. The italic was read sig-
nificantly slower than the roman and the all-
capital material was read significantly slower
than lower case printing. Reliability of meas-
urement was fairly high: a mean of .84 for
one and .87 for the other study. These re-
sults, therefore, indicate that 10 minutes is as
satisfactory as 30 minutes in the former com-
parison and that four minutes is as satisfac-
tory as 16 minutes in the latter comparison.
It is now desirable to explore results obtained
for shorter work periods, i.e., from 1¾ to 10
minutes.

The present experiment was designed to in-
vestigate the use of relatively short work pe-
riods in one type of visual research. It ap-
pears that the length of work period that
needs exploring ranges from 1¾ to 10 minutes.
In the study (3) cited above it is clear that
10 minutes was as satisfactory as longer pe-
riods in one comparison (italics vs. roman),
and, similarly, 4 minutes was satisfactory in
the other (all-capitals vs. lower case). The
present experiment is concerned with speed of
perception in reading under various levels of
illumination.

* The writer is grateful to the University of Minne-
sota Graduate School for a research grant to finance
this study.


Materials and Procedure 


Tinker’s Speed of Reading Test (2) was employed. 
Since comprehension is constant, the test measures 
virtually speed of perception in reading as a single 
variable. The two forms of the test are approxi- 
mately equivalent. Both were set in regular (ro- 
man) 10-point Excelsior type face with 2-point lead- 
ing in a 20-pica line width on eggshell paper stock. 
The 180 university sophomore Ss were tested indi- 
vidually in a light laboratory which provided well- 
dispersed indirect illumination. In Group A, the 
Control Group, Forms I and II of the test were 
given under 25 foot-candles of light. In Group B 
Form I was given under 5 foot-candles and Form II 
under 25 foot-candles of light. And in Group C Form 
I was administered under 25 and Form II under 200 
foot-candles of light. The time limit was 10 minutes 
for each form, with marks on the test to indicate the
end of 1¾ and 5 minutes of work. Successive Ss for
Groups A, B, and C were systematically rotated, i.e.,
S No. 1 was put in Group A, No. 2 in Group B, 
No. 3 in Group C, No. 4 in Group A, etc. 


Results and Discussion 


The results are given in Table 1. Inspec- 
tion of the table reveals the following trends: 
(a) The reliability of measurement is high, 
varying from .82 to .97 with a median coeffi- 
cient of .95. (b) In Group B the results show 
that similar trends occurred with 1¾, 5, and
10 minutes of reading, i.e., there was no sig- 
nificant difference in performance under 5 vs. 
25 foot-candles of light. (c) In Group C there 
was no significant difference in performance 
under 25 vs. 200 foot-candles of light with 
a work period of 1¾ minutes. With the 5-
minute and 10-minute work periods, the read- 


343 


Table 1

Illumination and Speed of Reading With Varying Lengths of Reading Time
(N = 60 in each test group, 180 in all)

                        Test Form                     Diff. Between
                        and                           Means in        Diff.             Diff./
Group   Time Limit      Foot-candles   Mean     SD    Paragraphs*     Per Cent    r     S.E. Diff.
(1)     (2)             (3)            (4)      (5)   (6)             (7)         (8)   (9)
A       First 1¾ min    I, 25          13.62    4.05     0.0            0.0       .83   0.00
                        II, 25         14.12    4.02
A       First 5 min     I, 25          46.77   11.28     0.0            0.0       .95   0.00
                        II, 25         46.62   12.09
A       First 10 min    I, 25          94.97   22.97     0.0            0.0       .97   0.00
                        II, 25         95.42   26.36
B       First 1¾ min    I, 5           12.87    3.21    +0.25          +1.9       .82   1.01
                        II, 25         13.62    3.19
B       First 5 min     I, 5           44.27    9.58    +0.18          +0.4       .92   0.37
                        II, 25         44.30    9.99
B       First 10 min    I, 5           89.80   20.58    -0.05          -0.1       .95   0.06
                        II, 25         90.20   21.58
C       First 1¾ min    I, 25          14.40    3.74    -0.10          -0.7       .86   0.35
                        II, 200        14.80    4.34
C       First 5 min     I, 25          48.92   10.57    -1.40          -2.9       .95   2.55
                        II, 200        47.37   12.49
C       First 10 min    I, 25          99.63   22.71    -3.07          -3.1       .97   3.55
                        II, 200        97.02   25.60
* The differences in ... were carried to 4 decimal places. The “corrections” amount
to -0.50 for the first 1¾ minutes, +0.15 ... difference between the mean scores of
Form I and Form II for the ...


ing was retarded significantly under the 200
foot-candles of light, at the 2 and 1 per cent
levels, respectively.

The results prese:
minutes of readin

It might be noted that Paterson and Tinker (1),
page 178, found similar trends in typography
studies, i.e., 1¾, 5½, and 10 minutes working
time all yielded comparable results. For 25
vs. 200 foot-candles the trend is different as
noted above.

The fact that speed of visual perception
in reading decreased under 25 vs. 200 foot-
candles for work periods of 5 and 10 minutes
needs some interpretation. It is suggested
here that possibly glare was operating to re-
duce efficiency under the 200 foot-candles of
light. It is possible that the 1¾-minute work
period was too short for the glare to reduce
efficiency but that the glare did become effec-
tive under the longer work periods of 5 and
10 minutes. Note in Table 1 that the de-
crease in efficiency was greater for the 10
than for the 5 minutes. Similar trends have
appeared for high light intensities in another
study not yet published.
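
The last column of Table 1 (Diff./S.E. Diff.) can be approximated from the other columns. The sketch below assumes the standard error of a difference between correlated means, sqrt(SE1² + SE2² - 2·r·SE1·SE2); the paper does not state its formula, so this is an assumption, and the result comes out at roughly 3.6, near but not identical to the tabled 3.55, presumably because the published SDs and r are rounded.

```python
import math


def critical_ratio(diff, sd1, sd2, r, n):
    """Difference between two correlated means divided by its standard error."""
    se1 = sd1 / math.sqrt(n)
    se2 = sd2 / math.sqrt(n)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    return abs(diff) / se_diff


# Group C, first 10 minutes: 25 vs. 200 foot-candles (values from Table 1).
cr = critical_ratio(diff=-3.07, sd1=22.71, sd2=25.60, r=0.97, n=60)
print(round(cr, 2))
```

The high correlation between forms is what makes a difference of about 3 per cent reach significance: it removes most of the between-subject variability from the standard error.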


Summary and Conclusions 
1. The purpose of this experiment is to pro-
vide additional information on the effect of
length of work periods in visual research.

2. Work periods of 1¾, 5, and 10 minutes
were used while investigating the relative effi-
ciency of speed of perception in reading under
5, 25, and 200 foot-candles of light.

3. There were 180 university sophomore 
Ss who were tested individually under well- 
controlled indirect illumination. 

4. The three work periods yielded the same 
results for the 5 vs. 25 foot-candles compari- 
son, i.e., there was no significant difference 
in performance with change in illumination. 
With the 25 vs. 200 foot-candles comparison, 
there was no significant difference for the 1¾-
minute work period but speed of reading was
significantly slower under the 200 foot-candles 
for both the 5- and 10-minute work periods. 


It is suggested that glare was having a dele- 
terious effect under the high illumination for 
the longer work periods. 

5. One may tentatively conclude from the 
above results that work periods as short as 1¾
minutes may be safely used in studying the 
relation between illumination and visual effi- 
ciency. 

Received January 17, 1958. 


References 


1. Paterson, D. G., & Tinker, M. A. How to make 
type readable. New York: Harper, 1940. 

2. Tinker, M. A. Tinker speed of reading test. Min-
neapolis: University of Minnesota Press, 1955.

3. Tinker, M. A. Prolonged reading tasks in visual
research. J. appl. Psychol., 1955, 39, 444-446.


Journal of Applied Psychology
Vol. 42, No. 5, 1958


Communication Restraints, Group Flexibility, and Group
Confidence ¹


Robert C. Ziller 


Fels Group Dynamics Center, University of Delaware 


Interpretations of the results of experiments 
concerned with group problem-solving produc- 
tivity are necessarily complicated by qualifica- 
tions involving unique task demands (7). In 
an attempt to avoid this limitation, the study 
reported here concerns the correlates of a 
postulated fundamental dimension of group 
problem-solving behavior, 
(5). More explicitly, the present study ex-
plores the relationship between selected group
characteristics and group flexibility opera- 
tionally defined as the ability of a group to 


reorganize to meet the time demands of a new 
situation. 


Problem 


In general, it was proposed that the more
flexible groups are tho: i 


less communication restraints, 
The following characteristics of the leaders 


and groups were selected as independent vari- 
ables, the F. 


leader, and the attractio 


gan, 1955. Special acknowledgment 
corded W. Clark Trow, 
Berkowitz, and E. Paul Torrance for their i 


critical reading of the original manuscript, 
2 Thi 


of the author. 
flecting the views or indorsement of the De; 




focal person, thereby restricting information
exchange and group flexibility.

An attractive group is described simply as
one in which the group members feel secure,
feel as if they “belong,” know what to expect
of others and can communicate with them
without restraint (3). There is evidence (6)
that members of highly attractive groups tend
to deviate from the group norms when the
norm is discordant with reality. In such
groups, the members are more disposed to
critically evaluate established procedures in
the light of new circumstances and evolve an
organization more in accord with the demands
of the given situation.

Finally, it was anticipated that the extent 
to which the leader is willing to modify his 
judgment toward the group norm (conform- 
ity) is related to group flexibility through the 
intervening variables of the relative openness 
of the group’s communication network. It 


Experimental Procedure 
Subjects 


Altogether, 96 B-29 or B-50 aircrews com-
prising about 1,000 men were involved in the
experiment. For the most part each crew
was composed of 11 men of whom five were
officers (including the assigned aircraft com-
mander or leader) and six were airmen. How-
ever, some of the crews were as small as eight
men and others as large as 15. All subjects
had been members of the same crew for a
minimum of three months. Some members
had been together for as much as a year.


Procedure 


Immediately preceding the problem-solving ses-
sions, the members of a group completed sociometric
and group attraction questionnaires. Then they
seated themselves around a rectangular area and the
general nature of the group task was described. The
directions read, in part, were:

“The object of the test is to make as high a crew
score as possible; that is, the highest score that your
crew can produce using any means at your disposal.
The way you go about this is entirely up to you.
You may hand in one answer sheet for the crew, an
answer sheet for each crew member, or any other
arrangements you wish to make. When two or more
answer sheets contain answers to the same question,
they will be averaged to provide a crew score for
that question.”

Any questions arising from the group were an-
swered by rereading the directions. At this point a
questionnaire purported to be a test of the indi-
vidual’s ability to estimate accurately the group’s
performance was administered. Actually, the ques-
tionnaire provided the data for the measurement of
group confidence. Then the members were allowed
one minute in which to examine a sample exercise,
and then sixteen minutes in which to complete all
the exercises comprising the group task. During
their performance, an observer recorded the group’s
method of approach to the task. Immediately fol-
lowing the problem-solving session, the crew mem-
bers completed the F scale and the “conformity”
scale.⁴

The Group Performance Test 


In the experiment, each crew member re-
ceived a copy of Test 401-B, one of the tests
of the Intellectual Talents Battery being de-
veloped for use in the Air Force by the Per-
sonnel Research Laboratory, Air Force Per-
sonnel and Training Research Center, Lack-
land Air Force Base, Texas. The test booklet
consisted of eight rather long exercises. In
each exercise, a problem was raised and 15
facts presented that had a bearing on the solu-
tion to the problem. The Ss were instructed
to choose five of the 15 facts most important
in reaching a decision on the problem. The
respondent selected the most important facts
in deciding whether a certain enemy-held
island base was being developed as an air
base or a submarine base, what type of home
a married officer should buy, what man a
squadron commanding officer should recom-
mend for Officer Candidate School, what city
should be selected for a recruiting office, what
a manufacturer should choose for a fac-
tory, which of two devices to train radar ob-
servers should be purchased by the Air Force,
which secretary should be hired, and which
of two types of visual aids should be used by
Air Force schools.

⁴ This igi p é Alvin Zanes, Seine 2, foun EE,
University of Michigan, and was used in the present
study at his suggestion.

Because of the complexity of the decision- 
making situation, it was impossible for an in- 
dividual or a discussion group to complete all 
eight problems efficiently within 16 minutes. 
The task was completed most expeditiously 
by splitting the group into two or more sub- 
groups and apportioning the problems among 
them.5

Unpublished reports describing the develop-
ment of the instrument indicate that measures
of its internal consistency by the split-half
technique were low (r = .37).


Measuring the Variables 


F Scale 


Two items were eliminated from the original 
F Scale when it was observed that Air Force 
personnel were openly derisive regarding the 
one, while the other appeared outdated. Re- 
ported reliability measures of the original in- 
strument range from .81 to .91 with an aver- 
age of .87 (1, p. 251). Scores were tabulated 
and divided into high and low categories by 
dividing the distribution at the median. 


Conformity of the Leader 

The measure of conformity included the fol- 
lowing items: 

Suppose you found yourself holding an opinion
which was not in agreement with an opinion held by 
the other people in your crew and this opinion was 
on an issue of importance both to you and to the 
crew. In each set of statements below, check the one 
alternative which would best express how you would 
feel and what you would do if such a situation 
existed in your crew. 

1. I would feel: (check one) a. Not at all con- 
cerned about my differing from the crew opinion, 
b. Slightly concerned about my differing from the 
crew opinion, c. Moderately concerned about my 
differing from the crew opinion, d. Greatly con- 
cerned about my differing from the crew opinion. 

2. If I found that my opinion differed from that 
of this crew, I would feel: (check one) a. Just as 
sure of my own opinion, b. Less sure of my own 


5 The methods used by the crews in the present 
experiment were anticipated through observations of 
aircrew performance on similar tasks (13). 




opinion, c. Much less sure of my own opinion, d. [...]
3. If I found that my opinion differed from that
of this crew, I would have: (check one) a. A desire 
to maintain my own opinion, b. Hardly any desire 
to change my own opinion to agree with that of the 
crew, c. A moderate desire to change my opinion to 
agree with that of the crew, d. A strong desire to 
change my opinion to agree with that of the crew, 
e. A very strong desire to change my opinion to 
agree with that of the crew. 

In calculating the total score, the alterna- 
tives of each question were assigned a weight 
of one to four or one to five, in accordance 
with the number of options. The distribution 
of scores was divided into high, medium, and 
low categories in accordance with the con- 
siderations described initially. The low range 
included scores 10, 11, 12, and 13. The cut- 
off scores were dictated by the necessity of a 
workable number of cases in the middle range. 
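
The scoring rule just described can be sketched as follows. This is only an illustration, not the author's procedure: the function names, the assumption that alternative "a" carries weight one, and the sample responses are hypothetical. Only the rule itself (alternatives weighted one to four or one to five according to the number of options, then summed) comes from the text.

```python
from string import ascii_lowercase


def score_item(choice: str, n_options: int) -> int:
    """Return the weight of the chosen alternative (assumed: 'a' -> 1, 'b' -> 2, ...)."""
    weight = ascii_lowercase.index(choice.lower()) + 1
    if weight > n_options:
        raise ValueError(f"alternative {choice!r} is outside the {n_options} options")
    return weight


def conformity_score(responses):
    """Sum the weights over all items; each response is (choice, number of options)."""
    return sum(score_item(choice, n_options) for choice, n_options in responses)


# Hypothetical respondent answering the three items quoted above
# (items 1 and 2 offer four alternatives, item 3 offers five).
example_responses = [("b", 4), ("a", 4), ("e", 5)]
example_total = conformity_score(example_responses)  # 2 + 1 + 5
print(example_total)
```

Totals computed this way would then be cut into high, medium, and low ranges, as the text describes.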


Group Attraction 


The index of group attraction followed di-
rectly from Festinger’s definition of group
cohesiveness, that is, the resultant of all
the forces acting on the members to remain in the
group (2, p. 164). The term “group attrac-
tion” was used rather than the more general
term “cohesiveness” since the two questions
involved in this index did not measure the
forces acting to prevent the individual from
leaving the group.


The index was derived from the following
questions:

1. If one man fr[...] main at the home base, rat[...]
[...] I’d feel very [...] c. It wouldn’t matter one [...]

The alternatives were the same in each of
the items.

In calculating the index of group attraction,
the alternatives of each item were weighted
from one to five, respectively. The score of
each individual group member was obtained
by a simple summation of the weighted re-
sponses. The mean score of the group pro-
vided the desired index. An index of the re-
liability of the measure was calculated using
the split-half technique in which each crew
was divided into two subgroups with approxi-
mately the same number of officers and air-
men. A Pearsonian correlation of .46 was
obtained by correlating the average scores of
the 96 pairs of subgroups.


Table 1


Analysis of Variance of the Group Approach
to the Problem and Group Score


Group Approach     N     Average Score
a                 37          8.15
b                 20         10.82
c                 13         12.39
d                  8         14.13
e                 18         14.98


Note: F = 18.03; df = 4 and 92; p < .01.


Group Flexibility


An operational definition of group flexibility
was derived from the five methods of ap-
proach to the task: (a) group members
worked individually with little or no interac-
tion; (b) group worked as a unit discussing
each problem in turn; (c) group began work-
ing as a unit (as in Method b) but, recog-
nizing that there was insufficient time to com-
plete the task by their initial method, changed
to Method e; (d) group worked as a highly
organized unit, limited the time spent on each
problem to two minutes and appointed a time-
keeper and recorder; (e) group subdivided
into two, three, four, or five sections and
divided the problems accordingly.

The frequencies with which each method
was employed are given in Table 1.

While the study is concerned primarily with
process rather than product variables and the
low reliability of the instrument notwithstand-
ing, it was observed that the method of ap-
proach placed definite limits on the score at-
tained.6 This observation was easily tested
through a simple analysis of variance between
method and group score (see Table 1). It
was also apparent that the mean score in-
creased in direct relation to the method em-
ployed. The highest scores were attained by
groups which used the subgroup or highly-
organized group approach.


6 Correct answers to the items were provided by
the test writers. In cases where one answer sheet was
submitted by the group, the group scores were ob-
tained from these single sheets by applying a simple
correction formula (R − [...]W). When members of
the same groups submitted answer sheets which over-
lapped, the mean of the scores became the group
score.

Groups using the approaches which re-
flected an awareness of the need to reorganize
to meet the time demands of the problem,
Methods c, d, and e, were compared with
groups using approaches a and b. Reorgani-
zation is used here in the sense that the
aircrew was already organized as a highly
integrated team with a single leader and was
accustomed to performing as a single unit.
Flexible crews were those who were able to
overcome this organizational set and reor-
ganize in accordance with the time demands of
the new situation. Actually, however, two in-
appropriate sets seemed to have been in-
volved: the usual group-test-situation set in
which comparisons of answers are not per-
mitted (Method a) and the set that crews
must perform as a single free-discussion unit
(Method b).


Group Confidence 


The individual group members indicated
their degree of confidence in the group by re-
sponding to the following question:


How well do you think your crew will do in this
problem-solving situation?

___ Superior: better than 90% of the other crews.
___ Above average: better than 70% of the other
    crews.
___ Average: about as well as 50% of the other
    crews.
___ Below average: only as well as 30% of the
    other crews.
___ Low: only as well as 10% of the other crews.


The alternatives of the question were
weighted from 1 to 5, with the lower score in-
dicating the greater confidence. The split-
half estimate of reliability was .55 (p = .001).

The relationships among the three inde-
pendent variables were calculated (Table 2).
The correlations were low. Along with the
[...] reliability estimates of the meas-
ures, the evidence permits the conclusion that
the [...] involves three relatively un-
related predictors. It was interesting to note,
however, that groups led by the more con-
forming leaders tended to be more attractive.


Table 2


Intercorrelations Among the Three
Independent Variables


I Rigidity of the leader
II Conformity of the leader
III Group Attraction


* Significant at the .01 level.


Results 


The data did not permit a pattern analysis 
involving the three independent variables sim- 
ultaneously as this would have generated cells 
without cases. Therefore, chi-square tests 
were calculated from the 2 x 2 tables formed 
by tabulating the frequency with which 
groups, dichotomized according to the inde- 
pendent variables, used the flexible or non- 
adaptive approaches to the task. 

The analysis of the data with regard to the 
relationship between the leader’s F-scale score 
and the criterion of group flexibility is pre- 
sented in Table 3. The differences with re- 
gard to the flexibility criterion were in the 
expected direction but statistically significant 
only at the 10 per cent level of confidence. 
Nevertheless, it was noted that 50 per cent of 
the low groups used the approaches which re-
flect flexibility as compared with 31 per cent
of the high groups. 
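
The chi-square value and the two percentages quoted above can be checked from the Table 3 frequencies. The sketch below is not the author's computation: it assumes a Pearson chi-square without continuity correction, and it reads the partly illegible High/b cell as 11 (an assumption, chosen because it reproduces the reported statistic). The function and variable names are illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Frequencies of each group approach by leader F-scale category (Table 3).
# The High/b count of 11 is an assumed reading of a garbled cell.
high_f = {"a": 22, "b": 11, "c": 4, "d": 3, "e": 8}
low_f = {"a": 15, "b": 8, "c": 9, "d": 4, "e": 10}


def collapse(row):
    """Pool approaches a and b (non-adaptive) against c, d, and e (flexible)."""
    return row["a"] + row["b"], row["c"] + row["d"] + row["e"]


high_nonflex, high_flex = collapse(high_f)
low_nonflex, low_flex = collapse(low_f)

chi2 = chi_square_2x2(high_nonflex, high_flex, low_nonflex, low_flex)
pct_low_flexible = 100 * low_flex / (low_nonflex + low_flex)
pct_high_flexible = 100 * high_flex / (high_nonflex + high_flex)

print(round(chi2, 2))            # 3.43
print(round(pct_low_flexible))   # 50
print(round(pct_high_flexible))  # 31
```

With these counts the collapsed table has 48 high-F and 46 low-F groups, which agrees with the note that data were incomplete for two of the 96 crews.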

With regard to the association between 
leader conformity and group flexibility (Table 


Table 3


The Relationship Between the Leaders’ F-Scale Scores
(Dichotomized) and Group Flexibility


                Group Approach
F Scale      a     b     c     d     e
High        22    11     4     3     8
Low         15     8     9     4    10


Chi square (a and b vs. c, d, and e) = 3.43; p = .10


Note: The total number of groups reported using each
approach is not always consistent with the number in Table 1
due to incomplete data with regard to two crews.


Table 4


The Relationship Between the Leaders’ Conformity
[Scale] Indices and Group Flexibility


                  Group Approach
Conformity     a      b      c     d     e
High          12      3      0    [1]   [3]
Medium       [11]   [13]    12     2    12
Low           14      4      1     4     3

Chi square (a and b vs. c, d, and e) = 6.33; p = .05

Note: The total number of groups reported using Approach
d is not consistent with the number reported in Table 1 due to
incomplete data. High and low conformity frequencies were
pooled throughout.


Table 6


Analysis of Variance of Group Confidence According
to Leaders’ F-Scale Scores and Group
Attraction Indices


Source of Variance    df    Variance Estimate      F       p
F Scale                1          16.62           1.88
Group Attraction       1         149.56          16.90    [...]
Interaction            1          35.88           4.05    [...]
Groups                90           8.85


4), it was seen that a greater percentage of 
groups in which the leader’s C-scale scores in- 
dicate moderate conformity used approaches 
reflecting greater flexibility. The approaches 
representing a high degree of flexibility were 
used by 52 per cent of the moderate groups 
but only by 27 per cent of the high-low 
groups. Thus, it was found that groups in 
which the leader was moderately concerned 
about the group’s opinions in comparison with 
those groups in which the leader was extremely 
concerned or unconcerned were more flexible 
in this problem-solving situation. 

The results with regard to the relationship 
between group attraction and group flexibility 
were not statistically significant and no trend 
was in evidence (see Table 5).

It was apparent, however, that the measure 
of group flexibility employed was not suf- 
ficiently sensitive; far too many groups were 
required for analysis. It is entirely possible, 
however, that groups in a setting other than 
the military would avoid the approach to the 
task in which the group acted merely as an 


Table 5 


The Relationship Between Group Attraction 
and Group Flexibility 


Group Approach 


Group
Attraction    a     b     c     d     e
High        [...]
Low          19    10     2     7     9
Chi square (a and b vs. c, d, and e) = .29; p = .70


Note (Table 6): When leaders’ F-scale scores were categorized
as high and low, and Group Attraction scores were similarly
categorized, the mean Group Confidence indices for groups
within the categories were (lower scores indicate greater
confidence in the group):


                   Group Attraction
F-Scale Score    High        Low         Total
High             1.80        2.19        2.00
                 (N = 23)    (N = 25)
Low              1.97        2.09        2.03
                 (N = 25)    (N = 21)
Total            1.89        2.14        2.01


uncommunicative collection of individuals. 
In this event, the time required to shift to the 
subgroup approach to the task might serve as 
a superior measure of group flexibility. Of
course, it still remains to be shown to what 
extent flexibility, as defined in the present 
study, is generalizable. 


Table 7


Analysis of Variance of Group Confidence According
to Leaders’ Conformity Scores and
Group Attraction Indices


Source of Variance    df    Variance Estimate      F       p
Leader Conformity      2          23.88           2.66    [...]
Group Attraction       1          91.60          10.20    [...]
Interaction            2           2.91
Groups                88           8.98


Note: When Leaders’ Conformity scores were categorized as
high, medium, and low, and Group Attraction indices were
dichotomized, the mean Group Confidence indices for groups
within the categories were (lower scores indicate greater
confidence in the group):


                        Leader Conformity
Group Attraction    High        Medium      Low         Total
High                1.77        1.93        1.93        1.89
                    (N = 12)    (N = 30)    (N = 6)
Low                 2.00        2.10        2.23        [2.14]
                    (N = 6)     (N = 20)    (N = 20)
Total               1.85        2.00        2.16        [2.01]


Table 8


Analysis of Variance of Group Confidence According
to Leaders’ F-Scale and Conformity
Scale Scores


Source of Variance    df    Variance Estimate      F       p
Leader Conformity      2          56.52           6.03    .05
Leader F Scale         1           1.64
Interaction            2           4.90
Groups                88           9.37


Note: When leaders’ conformity scores were categorized as
high, medium, and low, and leaders’ F-Scale scores were
dichotomized, the mean Group Confidence indices for groups
within the categories were (lower scores indicate greater
confidence in the group):


                           Leader Conformity
Leader F-Scale Score    High        Medium      Low         Total
High                    1.65        1.98        2.21        2.00
                        (N = 8)     (N = 24)    (N = 16)
Low                     2.00        2.01        2.09        2.03
                        (N = 10)    (N = 26)    (N = 10)
Total                   1.84        2.00        2.16        2.01


In analyzing the data with regard to group
confidence, factorial designs for cells of un-
equal size were employed (11, p. 285). (In
Tables 6, 7, and 8 the entries in each cell are
the mean group confidence scores for the
groups falling in the category.) The smaller
means indicate greater confidence.
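
The F ratios in these tables can be checked by dividing each effect's variance estimate by the "Groups" variance estimate. The sketch below does this for the values printed in Table 6. It assumes the ordinary F ratio for a fixed-effects design and does not attempt the unequal-cell adjustment the author cites; the variable names are illustrative.

```python
# Variance estimates (mean squares) as printed in Table 6.
groups_variance = 8.85  # "Groups" row, 90 df, used here as the error term
effect_variances = {
    "F Scale": 16.62,
    "Group Attraction": 149.56,
    "Interaction": 35.88,
}

# Each F ratio is the effect variance estimate over the error variance estimate.
f_ratios = {name: ms / groups_variance for name, ms in effect_variances.items()}

for name, f_value in f_ratios.items():
    print(f"{name}: F = {f_value:.2f}")
```

Rounded to two places, the ratios are 1.88, 16.90, and 4.05, matching the F column of Table 6.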

The results with regard to group attraction
and leader conformity were statistically sig-
nificant and indicate that members of the
more attractive groups whose leaders claim
that they tend to conform to the opinion of
the majority express greater confidence in
the group’s ability to succeed in a new situa-
tion. The main effects of leader F-scale scores
were not significant. However, when leader
F-scale scores and group attraction indices
were studied conjointly, interaction effects
were significant (see Table 6). The greatest
confidence was expressed by attractive groups,
the leaders of which had high F-scale scores.

Although group productivity was not central
to the study, nevertheless the relation between
the independent variables and the group scores
was explored by means of analysis of vari-
ance designs similar to those in Tables 6, 7,
and 8. As might have been anticipated on
the basis of the low reliability index of the
performance test, the results were not sta-
tistically significant nor were there any ap-
parent trends.


Discussion 


In terms of the communication framework 
sketched at the outset, both the more flexible 
groups and the more confident groups were 
generally found to be those with the more 
open communication systems or communica- 
tion systems with relatively fewer restraints 
on the member’s interaction. Thus, the re- 
sults with reference to group flexibility sug- 
gest that groups with moderately conforming 
and low F leaders adapted more readily to
the requirements of the new situation. The 
more confident groups were those with highly 
conforming leaders and those to which the 
group was highly attractive for the members. 

The apparent inconsistency in the findings 
regarding the conformity of the leader was
anticipated and interpreted as a qualification 
rather than as a directly contradictory re- 
sult. That is, it was anticipated that while 
overly nonconforming leaders would tend to 
restrict communication among the members, 
the overly conforming leader, on the other 
hand, would tend to lack the confidence neces- 
sary to initiate group action. Thus, the char- 
acteristics of the leader which the group mem- 
bers find desirable—high conformity—may 
not be desirable with reference to other 
criteria—group flexibility. 

The relationship between the independent 
variables presumed to be related to open com- 
munication systems and subsequently found 
to be related to group confidence corroborates 
the results of earlier experiments involving 
communication networks. In these earlier 
studies, it was revealed that, in general, mem- 
bers of groups with more open communica- 
tion networks tend to have high group morale 
(9, 10). However, it should be emphasized 
that the open communication systems in the 
present study are presumed to be psycho- 
logically determined rather than mechanically 
determined as was the case in the earlier 
studies. 

With reference to the specific independent
variables, the main effects of leader F-scale
scores were not significant with regard to
group confidence. However, when leader F-
scale scores and group attraction indices were
analyzed conjointly, interaction effects were
significant (see Table 6). The greatest con-
fidence was expressed in attractive groups, the
leaders of which had high F-scale scores; 
while the least confidence was expressed in low 
attraction groups, the leaders of which had 
high F-scale scores. If high group attraction 
is interpreted as indicating homogeneity of the 
members’ F-scale scores, the results are in 
agreement with those presented in an earlier 

study relating homogeneity of the members’ 
F-scale scores to group morale (4). 

Contrary to a number of previous studies 
(1, 8, 12) the results reported above suggest 
a positive aspect of “authoritarianism.” Pre- 
sumably, the positive or negative aspects of 
authoritarianism with regard to the leader 
emerge only in conjunction with other leader 


and group characteristics and under particular 
task demands.


Summary and Conclusions 


This study was designed to explore the re- 
lationship between selected group structure 
variables and the group’s ability to adjust 
to the requirements of a new situation (group 
flexibility) and the group members’ expressed 
confidence in the ability of the group to suc-
ceed in a problem-solving situation. The in-
dependent variables selected for analysis in-
cluded the F-scale scores and “conformity”
of the assigned leader and group attraction. 

The Ss were 96 aircrews comprising about 
1,000 men. The group task required the com- 
pletion of an eight-item intelligence examina- 
tion in a period of time which demanded a 
change in the group's customary operating 
procedure. 

In the more flexible groups the leaders 
scored low on the F-scale and moderately high 
on a scale of conformity. It was also found 
that greater confidence in the group was ex- 
pressed by members of high attraction groups 
and groups whose leaders tended to conform 
to the group members’ opinions. In addition, 


interaction effects of leader F-scale scores and
group attraction with regard to group con-
fidence were statistically significant. It was
concluded that groups with more open com-
munication systems (groups with fewer com-
munication restraints) are more flexible and
more confident.


Received January 27, 1958. 


References 


1. Adorno, T. W., Frenkel-Brunswik, Else, Levinson,
D. J., & Sanford, R. N. The authoritarian
personality. New York: Harpers, 1950.

2. Festinger, L., Schachter, S., & Back, K. Social
pressures in informal groups: A study of a
housing project. New York: Harpers, 1950.

3. Hartley, E. L., & Hartley, R. E. Fundamentals
of social psychology. New York: Knopf, 1952.

4. Haythorn, W. The effects of varying combina-
tions of authoritarian and equalitarian leaders
and followers. J. abnorm. soc. Psychol., 1956,
53, 210-219.

5. Hemphill, J. K. Situational factors in leadership.
Ohio State Univer.: Bureau of Educational
Research Monographs, 32, 1950.

6. Kelley, H. H., & Shapiro, M. M. An experiment
on conformity to group norms when con-
formity is detrimental to group achievement.
Amer. soc. Rev., 1954, 19, 667-677.

7. Kelley, H. H., & Thibaut, J. W. Experimental
studies of group problem-solving and process.
In G. Lindzey (Ed.), Handbook of social
psychology. Cambridge, Mass.: Addison-Wes-
ley, 1954.

8. Rokeach, M. Generalized mental rigidity as a
factor in ethnocentrism. J. abnorm. soc.
Psychol., 1948, 43, 259-278.

9. Shaw, M. E. A comparison of two types of
leadership in various communication nets. J.
abnorm. soc. Psychol., 1955, 50, 127-134.

10. Shaw, M. E., & Rothschild, G. H. Some effects
of prolonged experience in communication
nets. J. appl. Psychol., 1956, 40, 281-286.

11. Snedecor, G. W. Statistical methods. (4th ed.)
Ames, Iowa: Iowa State College Press, 1946.

12. Stern, G. G., Bloom, B. S., & Stein, M. L. As-
sessment of personality II: Studies of the re-
lationship between personality syndromes,
learning situations, and learning outcomes.
Chicago: Univer. Chicago, 1952.

13. Torrance, E. P. Crew performance in a test
situation as a predictor of field and combat
effectiveness. Washington, D. C.: Human
Factors Operations Research Laboratories,
1953. (HFORL Report No. 33.)


Journal of Applied Psychology
Vol. 42, [...]


The Comparative Effectiveness of Some Psychological and Physiological
Measures in Ranking the Impact of Diverse Environmental
Conditions 1


Bernard J. Fine


QM Research and Engineering Center Laboratories, Natick, Massachusetts


In the course of evaluating the thermal pro-
tection qualities of military clothing, it is fre-
quently necessary to measure differences in
the responses of soldiers along several physio-
logical dimensions. These measurements are
[...] taken, of necessity, under controlled
conditions and are not readily adaptable to
[...]. Furthermore, they are both time
consuming and expensive since they involve
the use of fairly elaborate equipment both in
obtaining the measures and in processing the
data.

The present study investigates, in two parts,
the use of subjective rating scales as sub-
stitutes for the more complex physiological
measurements.

The subjective aspects of man’s thermal en-
vironment have had little systematic study.
[...] work that has been done is psy-
cho[...] in nature and involves, primarily,
[...] of thresholds of temperature
[...] (4, 6, 7, 8). Several studies have
investigated the relationships between person-
ality [...] and subjective responses to
[...] factors (1, 3, 10, 11).

There have also been some attempts to get at
the psychological and physiological bases of
subjective reports of warmth or coldness (1,
[...]). Insofar as is known, there have
been no studies comparing psychological with
physiological measurements in their effective-
ness in ranking a number of different en-
vironments according to the thermal state of
individuals in those environments. [...]
comparison is reported in the research de-
scribed [...].


1 The author wishes to thank P. Iampietro for his
cooperation in making the physiological data avail-
able and for his helpfulness with regard to the physio-
logical aspects of the study.


Study Number One 


Method 


Subjects. The subjects (Ss) were six enlisted men 
stationed at the Quartermaster Research & Engineer- 
ing Command as volunteer test subjects. They are 
described in greater detail in a paper by Iampietro 
et al. (9). They were considered to be “normal” on 
the basis of psychological and physiological measure- 
ments. 

Procedure. The six Ss, clad only in shorts, were 
exposed as a group to different environmental condi- 
tions on eight successive days (excluding one week- 
end). The conditions represented combinations of 
ambient temperatures of 50 and 60 degrees F. with 
humidities of 30 and 95 per cent and wind speeds of 
zero and 10 miles per hour. The order in which the 
conditions were presented was randomly determined. 
Each exposure lasted two hours each day. A one- 
hour “control” period preceded each exposure and a 
one-half hour “recovery” period followed each ex- 
posure. Physiological and psychological measures 
were taken during the control, exposure, and recov- 
ery periods. 
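
The eight conditions form a full 2 x 2 x 2 crossing of temperature, humidity, and wind speed. The sketch below enumerates them in the temperature/humidity/wind notation used in the tables and draws one random presentation order. The seed, the variable names, and the use of Python's random module are illustrative assumptions; the report says only that the order was randomly determined.

```python
import itertools
import random

temperatures_f = (50, 60)   # ambient temperature, degrees F
humidities_pct = (30, 95)   # relative humidity, per cent
wind_speeds_mph = (0, 10)   # wind speed, miles per hour

# Every combination of the three two-level factors, e.g. "50/30/0".
conditions = [
    f"{t}/{h}/{w}"
    for t, h, w in itertools.product(temperatures_f, humidities_pct, wind_speeds_mph)
]

# One random order of presentation, one condition per exposure day.
rng = random.Random(1958)  # arbitrary seed so the sketch is repeatable
presentation_order = list(conditions)
rng.shuffle(presentation_order)

for day, condition in enumerate(presentation_order, start=1):
    print(f"Day {day}: {condition}")
```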


The physiological measures which are considered
in this report are mean weighted skin temperature
(MWST) and increase in metabolic rate (ΔMR).
These are discussed fully in the paper by Iampietro
et al. (9) as is the methodology in general. Briefly,
MWST represents an average skin temperature based
on readings obtained from various parts of the body.
Metabolic rate increase, as used herein, refers to the
average increase in metabolism (cal./hr.), during ex-
posure, over the reading obtained in the control
period, which is an approximate basal reading. The
average increase in MR is based on four MR meas-
ures taken during the exposure period.

The psychological measure consisted of a paper-
and-pencil subjective rating scale on which the S
indicated how various parts of his body (head,
chest, back, arms, hands, legs, and feet) felt on a six-
point scale ranging from “very hot” to “very cold.”
Each S indicated how the body parts felt at half-
hour intervals during the control, exposure and re-
covery periods. Only the four exposure ratings are
considered in this report.


Results 
Agreement between subjects: skin tempera-
ture. A MWST reading of each S was ob-
tained at the end of each two-hour exposure
period. The eight conditions were ranked
from warmest to coldest for each S on the
basis of his MWST’s. A Kendall coefficient
of concordance (12) calculated from the rank-
ings for the six Ss yielded a W of .91 which
is significant at better than the .01 level.
This indicates that there is a high degree of
agreement between the six sets of rankings.

Agreement between subjects: metabolic
rate increase. An average ΔMR for each S
during each two-hour exposure period was
obtained. The eight conditions were ranked
from warmest to coldest for each S on the
basis of the ΔMRs. A coefficient of concord-
ance calculated from the six sets of rankings
yielded a W of .83 which is significant at
better than the .01 level, indicating a high
amount of agreement between the six sets of
rankings based on MR.

Agreement between subjects: subjective
rating scale. Numbers were assigned to the
[...] exposure administrations of the scale were
averaged. An average subjective response for
each S for each experimental condition was
thus obtained.2 On the basis of these aver-
ages, it was possible to rank the eight condi-
tions for each S in terms of how warm
they made him feel. [...]
rankings based on subjective ratings.

Comparison of subjective and physiologi-
cal measures. Having shown that there is a
high degree of agreement between Ss in rank-
ing the eight conditions according to their
subjective and physiological responses, it was
then permissible to determine the extent of
agreement between the subjective and physio-
logical methods of ranking the conditions.
To do this, the individual responses were
averaged so that a mean group MWST, ΔMR,
and subjective rating was obtained for each
of the eight conditions. The eight conditions
were then ranked from “warmest” to “cold-
est” according to the average measurements.
Thus, three sets of rankings of the conditions
were obtained; one based on MWST, one on
ΔMR, and on[...] [...] de-
gree of agreement between the subjective and
[...] ranking the condi-
tions. The group averages and rankings for
the MWST, ΔMR, and subjective rating
methods are shown in Table 1. The rank
order correlation between the subjective rat-
ings and MWST yielded an r_s of .90 and that
between subjective ratings and ΔMR an
r_s of 1.00. These correlations indicate a high
degree of agreement between the subjective
rating scale and each of the physiological
measures in ranking the conditions from
warmest to coldest.


2 The numbers were assigned to the words strictly
as a convenience in ordering the data. No assump-
tion was made regarding the equality of the intervals
of the subjective scale.


Table 1


Comparison of Rankings of Conditions by MWST,
ΔMR, and Subjective Methods a


Conditions b    MWST Method    ΔMR Method    Subjective Method
50/30/0          3 (79.8)       5 (85)         5 (1.83)
60/95/0          1 (83.8)       1 (18)         1 (2.73)
50/30/10         7 (69.8)       8 (171)        8 (1.00)
60/30/0          2 (81.3)       2 (38)         2 (2.62)
60/95/10         6 (74.0)       6 (89)         6 (1.73)
50/95/0          4 (79.4)       3 (39)         3 (2.10)
50/95/10         8 (69.4)       7 (119)        7 (1.62)
60/30/10         5 (74.8)       4 (74)         4 (1.85)


a The numbers in parentheses refer to the average upon
which the ranking is based. MWST averages are in degrees F.;
ΔMR averages are in cal./hr.; subjective averages are
based on the 1 through 6 ratings of the subjective scale.

b In a condition designation, e.g., 50/30/0, the first figures
refer to ambient temperature, the second to
humidity and the third to wind speed.

c Conditions [...] presented.
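
The rank order correlations for Study Number One can be recomputed from the group rankings printed in Table 1. The sketch below assumes the usual Spearman formula for untied ranks, r_s = 1 - 6(sum of d squared)/(n(n squared - 1)). The report does not state which computing formula was used, and the function and variable names are illustrative.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """Spearman rank correlation for two sets of untied ranks of equal length."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Group rankings of the eight conditions, in the row order of Table 1
# (50/30/0, 60/95/0, 50/30/10, 60/30/0, 60/95/10, 50/95/0, 50/95/10, 60/30/10).
mwst_ranks = [3, 1, 7, 2, 6, 4, 8, 5]
metabolic_ranks = [5, 1, 8, 2, 6, 3, 7, 4]
subjective_ranks = [5, 1, 8, 2, 6, 3, 7, 4]

rs_subjective_mwst = spearman_rank_correlation(subjective_ranks, mwst_ranks)
rs_subjective_metabolic = spearman_rank_correlation(subjective_ranks, metabolic_ranks)

print(f"subjective vs. MWST: {rs_subjective_mwst:.3f}")                  # 0.905
print(f"subjective vs. metabolic rate increase: {rs_subjective_metabolic:.2f}")  # 1.00
```

The subjective and metabolic rankings are identical, which is why that coefficient is exactly 1.00; the subjective and MWST rankings differ on five of the eight conditions.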


Study Number Two 
Method 


Subjects. The Ss were six enlisted men stationed at
the Quartermaster Research & Engineering Command
as volunteer test subjects. One S had participated
in Study Number One.


Procedure. The procedure was exactly the same
as described in Study N[...]


Table 2


Comparison of Rankings of Conditions by MWST,
ΔMR, and Subjective Methods


Conditions    MWST Method    ΔMR Method    Subjective Method
50/30/10       5 (75.6)       6 (81)         6 (2.25)
40/95/10       7 (72.0)       8 (101)        8 (1.80)
40/95/0        3 (79.5)       3 (51)         3 (2.83)
50/95/10       6 (74.9)       5 (79)         5 (2.50)
[40/30/0]      4 (78.8)       4 (59)         4 (2.65)
50/95/0        1 (81.7)       1 (27)         1 (3.08)
50/30/0        2 (80.8)       2 (34)         2 (2.87)
40/30/10       8 (71.4)       7 (93)         7 (2.05)


Results


Agreement between subjects. As in Study
Number One, coefficients of concordance were
calculated from the rankings based on MWST,
ΔMR, and subjective ratings. These were
.86, .75, and .85, respectively. All three are
significant at better than the .01 level and
indicate a high amount of agreement between
the six sets of rankings with respect to MWST,
ΔMR, and subjective ratings.

Comparison of subjective and physiological
measures. The data were treated the same
way as in Study Number One. The com-
parisons between the MWST, ΔMR, and sub-
jective rating methods are shown in Table 2.
The rank order correlation between subjective
ratings and MWST yielded an r_s of .93 and
that between subjective ratings and ΔMR an
r_s of [1.00]. As in Study Number One, the cor-
relations indicate a high degree of agreement
between the subjective rating scale and each
of the physiological measures in ranking the
conditions from warmest to coldest.


Discussion 


The results of both studies indicate that if
it is necessary to compare environmental con-
ditions and/or order them according to the
warmth or coldness of groups exposed to them,
the subjective rating scale method is as ef-
ficient as either of the two physiological
methods. The subjective method is also
simpler to administer and analyze for scoring.




It should be noted that when actual physio- 
logical values are required, the subjective 
scales are relatively inaccurate in their ap- 
proximations and cannot replace the physio- 
logical measurements. 

On the other hand, when an assessment of 
Ss’ thermal “feelings” is desired, for other 
than comparative purposes, the appropriate 
measure is the subjective one. One cannot 
accurately infer actual subjective comfort 
from physiological measurements. 

While the two studies did not investigate 
the comparison of physiological and psycho- 
logical responses of groups wearing different 
clothing in a constant environment, the re- 
sults indicate that the subjective method 
should be as efficient as the physiological 
methods in such instances for comparative 
purposes so long as MWST’s are not expected 
to drop below approximately 65 degrees F. 
There is some evidence that below certain 
MWST’s, the subjective scale loses some of its 
accuracy.’ The precise point at which the 
subjective scale used herein becomes inac- 
curate has not yet been established. 


Summary 


Two studies are presented, both of which 
compare the effectiveness of a subjective rat- 
ing scale with two physiological measures, 
mean weighted skin temperature and average 
increase in metabolic rate, in ranking eight 
environmental conditions (varying in ambient 
temperature, humidity and wind speed) from 
warmest to coldest. 

The results of both studies indicate a high 
degree of consistency between individual re- 
sponses on each measure and a high degree 
of agreement between the subjective rating 
method and both mean weighted skin tem- 
perature and metabolic rate methods of rank- 
ing the conditions. 

It is noted that the subjective scale might
be used instead of the physiological measure- 
ments for the purpose of comparing condi- 
tions with regard to the relative warmth or 
coldness of groups of individuals within the 
conditions, but only for that purpose. 


Received January 29, 1958. 
8 John McGinnis, Personal communication. 1957. 




References 


1. Blair, J. R., Urbush, F. W., & Reed, I. T. Pre- 
liminary observations on physiological, nutri- 
tional, and psychological problems in extreme 
cold. Fort Churchill, Canada. Proj. No. 57-3, 
Field Res. Lab., Ft. Knox, Ky., 1947. 

2. Chagas, C., Mortara, G., & Borges Sampaio, F. 
Alguns estados sobre e sensibilidade termica. 
An. Acad. Brazil Cienc., 1947, 19(1), 71-102. 

3. Debons, A. Survey of human adjustment problems in the northern latitudes. Personality predispositions of infantrymen as related to their motivation to endure tour in Alaska. A comparative evaluation. Arctic Aeromed. Lab., Ladd AFB, Alaska, 1950. (Proj. Rep. No. 21-01-022.)

4. Ebaugh, F. G., Jr., & Thauer, R. Influence of 
various environmental stimuli on cold and 
warmth thresholds. J. appl. Physiol., 1950, 
3, 173. 

5. Freedman, A., & Horvath, S. M. Cold weather 
operations. Test of the adequacy and range 
of use of winter clothing. Study of methods 
for selection of men for cold weather opera- 
tions. Proj. No. 1-1: 1-18, Armored Med. 
Res. Lab., Fort Knox, Ky., 1944. 


Bernard J. Fine 


6. Hardy, J. D., Goodell, H., & Wolff, H. G. In- 
fluence of skin temperature upon pain thresh- 
old as evoked by thermal radiation. Science, 
1951, 114, 149-150. 

7. Hendler, E., & Hardy, J. D. Temperature sensa- 
tions accompanying changes in skin tempera- 
ture. NAMC-ACEL-350, Proj. No. NM 
1701132, Naval Air Mat. Ctr., Phila., 1957. 

8. Herget, C. M., Granath, L. P., & Hardy, J. D. Thermal sensation and discrimination in relation to intensity of stimulus. Am. J. Physiol., 1941, 134, 645.

9. Iampietro, P. F., Bass, D. E., & Buskirk, E. R. 
Heat exchanges of nude men in the cold: 
Effect of humidity, temperature and wind 
speed. J. appl. Physiol., in press. 

10. McCollum, E. L. Personality alteration during reduced caloric intake under survival conditions in the sub-Arctic. Arctic Aeromed. Lab., Ladd AFB, Alaska, 1950. (Proj. Rep. No. 21-01-025.)

11. Patrick, J. R., & Rowles, E. Intercorrelations between metabolic rate, vital capacity, blood pressure, intelligence, scholarship, personality and other measures on university women. J. appl. Psychol., 1933, 17, 507-521.

12. Siegel, S. Nonparametric statistics. New York: McGraw-Hill, 1956.


Journal of Applied Psychology 


Vol. 42, No. 6


DECEMBER, 1958 


An Experimental Investigation of “Point” Job Evaluation Systems ¹


James H. Myers


Prudential Insurance Company


1 This work was done as partial fulfillment of requirements for a Ph.D. degree at the University of Southern California, 1956.


[...] investigation of “point” job evaluation systems dates back to Lawshe, who, during the middle 1940's, undertook a series of studies aimed at uncovering the basic factors operating in systems of this type. After five studies (5, 6, 8, 9, 10) involving [...] factor analyses, his results were characterized by one general factor which accounted for between 77 and 99 per cent of the variance in total job value and which correlated highly with the rating on nearly every job requirement. He called this factor “Skill Demands.”

Subsequent factor analyses by other investigators (1, 3, 4, 11) have confirmed the fact that although a number of basic factors might emerge in a particular factor analysis, in most studies one out of the number would account for upwards of 80 per cent of the variance in total points, or job value.

An additional study by Lawshe, Dudek, and Wilson (7), however, produced results which were quite different from those of all other studies. In this particular study, 40 jobs were rated under two job evaluation systems, one with 11 items, the other with four. Factor analysis of both systems produced five factors for each. In contrast to other studies, however, in the 11-item system, four of the five factors each accounted for between [...]% and 37% of the total point variance, while in the four-item system each of the five factors accounted for between 4% and [...]% of the total point variance.

In personal experience with job evaluation in industry, the author had observed that
evaluations within a given company might be
subject to manipulation or “forcing” by evalu- 
ators in order to produce a desired total 
evaluation or pay range for a job. The less 
accurate the system in operation, the greater 
the need for forcing of job ratings. It was 
felt that differences in results between the 
particular study (7) mentioned above and 
other studies were due in some measure to the 
fact that evaluations in the former were not 
forced (since they were made by relatively 
disinterested evaluators in a number of dif- 
ferent companies who would have had no in- 
centive to force ratings), while those in other 
studies probably were forced. 

The present study was undertaken to de- 
termine the extent to which the forcing of job 
ratings could influence results to be obtained 
from statistical investigations of point job
evaluation systems.


Method 


In order to determine the effects forcing can have, 
it was necessary to create an experimental situation 
wherein forcing could be made to occur under con- 
trolled conditions, so that comparisons could be made 
between forced and unforced evaluations. Accord- 
ingly, a sample of jobs were first evaluated under 
conditions designed to greatly reduce or eliminate the 
possibility of forcing. Then the ratings of the same 
jobs were “forced” to predetermined standards, as de- 
scribed below. It was then possible to note differ- 
ences between forced and unforced evaluations in 
terms of factor structure and factor loadings, to de- 
termine the effects of forcing. 

This procedure had the further advantage of hold- 
ing constant the effects of “halo” in the job ratings 
(halo might be thought of as an unintentional bias, 
while forcing would be more in the nature of inten- 
tional). Both forced and unforced evaluations undoubtedly contained halo,² but under the conditions of the experiment, differences in the two sets of ratings should have been mainly due to the effects of forcing alone.

System, jobs, and raters. Although a point job
evaluation system was already in operation in the 
Prudential Insurance Company, where this study was 
done, a new evaluation system was developed for the 
study. The use of the system then in operation 
might have encouraged conscious or subconscious ad- 
justments of ratings in order to conform to existing 
official evaluations of the jobs studied. This would 
have violated the basic design of the study. 

A new job evaluation system was developed, consisting of 17 ³ job characteristics or requirements (e.g., mental requirements, experience requirements, physical demand, etc.). Five grades, from most to least, were written for each factor.

The new system was applied to a sample of 82 jobs in operation at the Prudential’s Western Home Office, Los Angeles, California. Jobs were selected so as to adequately [...] work performed by [...] company. Three company employees, [...] point job evaluation techniques, rated the 82 jobs on each of the 17 job requirements. Ratings were done under two conditions.

First condition (unforced) ratings. Raters were instructed to evaluate [...] separately, without regard to any total or over-all evaluation which might result. Letters rather than [...] the descriptive grades [...] in order to eliminate the [...] various requirements could be summed to produce a total score for each [...]. [...] evaluations were assumed to be [...] as possible, and were denoted as “first condition” evaluations.

2 It is the writer’s opinion that the effects of halo were great in the evaluations of this study. Raters were experienced in point job evaluation and seemed (to the writer) unable to retreat far from the concept of “over-all job value” in assigning unforced evaluations [...].

3 It was not necessary for purposes of this study to have as many as 17 requirements [...]. However, a secondary purpose [...] study was to determine by factor [...]




Second condition (forced) ratings. Arbitrary 
weights for each job requirement were developed by 
the raters and applied to the above first condition 
evaluations. This produced a total point score for 
each job. Raters were then informed of the actual 
level (actual pay grade, as determined by the evalua- 
tion system already in operation at the time of the 
study) of each job, furnished with a total-point-to- 
level conversion scale (a scale whereby point values 
are transformed into job levels, or pay grades), and 
asked to “adjust” first condition evaluations, where 
necessary, so that the final total point score for each 
job would conform to the actual job level on the 
conversion scale. They were instructed to make
whatever changes seemed “most reasonable” in in- 
dividual job requirement ratings, even though job- 
to-job comparisons might be distorted. 
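
The second-condition procedure can be sketched as follows. This is a minimal illustration only: the requirement names, weights, grade values, and conversion scale below are hypothetical stand-ins (the study used 17 requirements and rater-developed weights that are not reported here), and treating "A" as the highest of the five letter grades is an assumption.

```python
from bisect import bisect_right

# Hypothetical weights for three job requirements (the study used 17).
WEIGHTS = {"mental": 10, "experience": 8, "physical": 4}
# Assumed numeric values for the five letter grades, A = most, E = least.
GRADE_VALUES = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
# Hypothetical conversion scale: lowest point total for levels 2, 3, 4, 5.
LEVEL_FLOORS = [40, 60, 80, 100]


def total_points(ratings):
    """Weighted sum of grade values over all job requirements."""
    return sum(WEIGHTS[req] * GRADE_VALUES[grade] for req, grade in ratings.items())


def job_level(points):
    """Convert a total point score into a job level (pay grade)."""
    return bisect_right(LEVEL_FLOORS, points) + 1


def needs_forcing(ratings, actual_level):
    """True when the unadjusted ratings do not reproduce the actual level."""
    return job_level(total_points(ratings)) != actual_level


unforced = {"mental": "B", "experience": "C", "physical": "D"}
print(total_points(unforced), job_level(total_points(unforced)))  # 72 points, level 3

# If the job's actual level is 4, a rater might raise one rating to close the gap.
forced = dict(unforced, mental="A")
print(needs_forcing(unforced, 4), needs_forcing(forced, 4))
```

In this sketch the adjustment is made to a single requirement; the raters in the study were free to change whichever individual ratings seemed most reasonable to them.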

Raters were not informed of the reasoning or pur- 
pose behind second condition evaluations. They were 
told that the adjustments were necessary in order to 
provide certain additional statistical data which were 
needed. The resulting “forced” ratings were denoted 
as “second condition” evaluations. 

Both first and second condition evaluations were 
checked for reliability, intercorrelated, and factor 
analyzed, using Thurstone’s complete centroid method 
(12). Rotations were orthogonal to the strict criterion
of simple structure (to avoid subjectivity in
rotation as much as possible). Effects of adjusting
or forcing evaluations were noted in terms of
changes in rotated factor structure and factor load-
ings from first to second condition evaluations. Of
particular interest were changes in loadings of job
level on the factors which emerged, since all adjust- 
ments were made in relation to this value. 


Results 


Factor structure. Five factors emerged under both first and second conditions, as shown in Tables 1 and 2. The three predominant factors under the first condition corresponded closely with the three predominant factors under the second condition. The remaining two factors were somewhat different, possibly enough to alter factor interpretations. However, neither factor was sufficiently well defined to permit any definite conclusions.

From first condition loadings, the factors were designated as follows:

(A) Over-all Value: High loadings on nearly all job requirements. High correlation with total points, or job level.

(B) Supervision: High loadings on requirements involving direction of subordinates.

(C) Physical Components: High loadings on requirements dealing with physical demand and working conditions. High loadings on Education requirement also, but this requirement would have been split off if rotational criteria had included psychological meaningfulness.


Table 1
Rotated Factor Loadings: First Condition

                                            A     B     C     D     E    h²
 1. Mental Requirements                    94    09    06   -14   -04    92
 2. Frequency of Decisions                 94   -04    05   -05   -11    90
 3. Difficulty of Judgment                 96    01    08   -03    02    93
 4. Constant Attention to Details          66    02    12   -35   -05    58
 5. Education                              30   -12    59    04    13    47
 6. Experience                             81    16   -05    07   -09    70
 7. Effect of Inaccurate Work              91    02    08   -12    17    88
 8. Review on Work                         69    08   -13    33   -10    62
 9. Persuasion                             82   -35    02    20   -15    86
10. Importance of Contacts to Company      85   -43   -04    25    10    98
11. Frequency of Contacts                  66   -62    07    30    10    92
12. Variety in Work                        87    04   -02    12   -10    78
13. Confidential Nature of Work            40   -20   -13    02    27    29
14. Working Conditions                     01   -39    55   -01   -16    48
15. Physical Demand                       -31   -32    57    03   -13    54
16. Number Supervised                      12    86    08    49   -01   100
17. Difficulty of Supervision Given       -04    85    04    50    04    98
18. Job Level                              93    05    20    25    11    98

Note. Decimal points have been omitted.


(D) Independent Action: Freedom or latitude in approaching job duties. Characterizes work which may be done independently, in the sense that a supervisor has latitude in how he handles group, etc.
Table 2
Rotated Factor Loadings: Second Condition
A B C D E h²
1. Mental Requirements 89 02 02 E- a = 
2. Frequency of Decisions 95 0: =02 Ee = 92 
3. Difficulty of Judgment 94 -Ol -06 —17 . Z 
4. Constant Attention to Details 68 -03 a oa 
5. Education aS. eae =. = 
6. Experience 80 Oo =14 -05 2 X 
T. Effect of Inaccurate Work T “3 - -a i n 
8. Review on Work 12 o a = E 4 
9. Persuasion Bo) esas 5 P E 
10. Importance of Contacts to Company 84 —45 "a A = = 
11, Frequency of Contacts i =H = 00 -07 76 
12. Variety in Work a6 p re 02 32 37 
13, Confidential Nature of Work 43 —26 mae 08 19 46 
14. Working Conditions ii -23 at 08 19 58 
15. Physical Demand “at a a 26 06 100 
e Number Supervised = ia = 29 12 95 
a Di Wy T, . AJ = 
n ificulty of Supervision Given 99 10 05 04 10 100 


18 J 
~~ - Job Level 
we 


Note. Decimal points have been omitted.


Table 3
Job Requirements Showing Greatest Changes in Factor Loadings from First to Second Conditions for Each Factor

                        Requirement Showing        Factor Loadings
Factor                  Greatest Change          1st Cond.   2nd Cond.
Over-all Value          Education                   .30         .43
Supervision             Physical Demand            -.32        [...]
Physical Components     Diff. of Judg.              .08        [...]
Independent Action      No. Supervised              .49        [...]
Confidential Nature     Education                   .13        [...]


(E) Confidential Nature: Confidential na- 
ture of material normally handled in perform- 
ing job duties. 


Contrary to expectations, forced as well as 
unforced evaluations yielded factor structures 
which were similar to those found in the 
majority of previous studies; that is, one 
principal factor (Over-all Value) showed high 
loadings on nearly all job requirements and 
explained nearly all the variance of job level. 

Factor loadings. In spite of the over-all 
similarity between first and second condition 
factor structures, factor loadings of job re- 
quirements showed some fairly substantial 
differences. These differences (between load- 
ings of a job requirement on the same factor 
from first to second conditions) were greatest 
in the less well-defined factors which emerged. 

Table 3 indicates the job requirements 
which showed the greatest changes in loadings 
on each factor from first to second conditions 
(from Tables 1 and 2). 

Thus, for the Over-all Value factor, the “Education” requirement showed the greatest change in factor loadings from first to second conditions, increasing from .30 to .43, respectively.

Forcing also changed the job level variance explained by some of the factors which emerged. It can be seen from Tables 1 and 2 that forcing ratings to conform to job level increased from .93 to .99 the correlation between job level and Over-all Value, the principal factor which emerged. This had the effect of increasing the job level variance explained by this factor from 86% under the first condition to 98% under the second. This increase was offset by decreases in job level variance explained by two other factors, Physical Components and Independent Action. Percentages of job level variance explained by the two remaining factors remained relatively constant.
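
The percentages above follow from the usual relation for orthogonal factors: the share of a variable's variance explained by a factor is the square of its loading on that factor, and the communality (h²) is the sum of those squares. The sketch below applies this to the first-condition Job Level loadings as read from Table 1 and to the .99 loading quoted in the text; the function names are illustrative only.

```python
def variance_explained(loading):
    """Proportion of a variable's variance explained by one orthogonal factor."""
    return loading ** 2


def communality(loadings):
    """Sum of squared loadings across all factors (h squared)."""
    return sum(variance_explained(x) for x in loadings.values())


# Job Level loadings under the first (unforced) condition, from Table 1.
job_level_first = {
    "Over-all Value": 0.93,
    "Supervision": 0.05,
    "Physical Components": 0.20,
    "Independent Action": 0.25,
    "Confidential Nature": 0.11,
}

for factor, loading in job_level_first.items():
    print(f"{factor}: {100 * variance_explained(loading):.0f}%")

# Over-all Value under the second (forced) condition, as quoted in the text.
print(f"Forced Over-all Value: {100 * variance_explained(0.99):.0f}%")
print(f"First-condition communality: {communality(job_level_first):.2f}")
```

Squaring .93 and .99 gives roughly 86% and 98%, which matches the figures reported for the Over-all Value factor.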


Discussion 


It should be noted that the changes from first to second conditions found in this study were not the same as the types of changes which occur in the more usual experimental situation. In the latter, separate measurements are taken before and after the introduction of instructions or conditions, the effects of which are being measured. In the present study, the measurements themselves, which were obtained under the first condition, were changed.

Differences between forcing and “halo” are important to consider in this study. It is believed that forcing represents an influence over and above that of halo. In the latter, it might be assumed that the rater is unintentionally influenced in his evaluations by an over-all impression of job worth. The situation would be structured in such a way that he would have no ulterior motive for producing a particular final or over-all rating, and would be likely to do his best to rate accurately.

In the case of forcing, however, there would be a more conscious desire to produce a particular evaluation for a reason. For example, in evaluating jobs the proper organizational “fit” for a job could be achieved by forcing if initial evaluation attempts did not place a job in “proper” relation to others.




First condition (unforced) evaluations were undoubtedly influenced to some extent by halo, and this influence would have been carried directly over into second condition evaluations. However, only second condition evaluations should have been influenced by conscious forcing of the type described above. The change from first to second conditions should have resulted solely from forcing, and it was this phenomenon that was under consideration in this study.

It is impossible to tell from this study alone the maximum effects that forcing can produce on a job evaluation system. It is presumed that the effects are at least as great, if not greater, in many job evaluation systems in actual operation. This study suggests that an investigator about to undertake a statistical study of job evaluation systems should be aware that forcing may have influenced evaluations already available in a plant or office, and he should plan his investigation accordingly.

Summary 


Eighty-two jobs were evaluated by three raters, on 17 job requirements or characteristics. Evaluations were done under two conditions: unforced and forced. Findings were:

1. Five factors emerged in both unforced and forced evaluations. The three predominant factors in each were similar, the remaining two somewhat changed.

2. Both forced and unforced evaluations yielded factor structures similar to those in most previous studies; i.e., one over-all factor explaining most of the variance of job level.

3. Forcing had the effect of increasing the job level variance explained by the principal factor from 86% in unforced ratings to 98% in forced ratings. Factor loadings of some individual job requirements were also rather markedly affected by forcing.


Received March 17, 1958. 


References 


1. Ash, P. A statistical analysis of the Navy’s method of position evaluation. Publ. Personn. Rev., 1950, 11, 130-138.

2. Ebel, R. L. Estimation of the reliability of ratings. Psychometrika, 1951, 16, 407-424.

3. Grant, D. L. An analysis of a point rating job evaluation plan. J. appl. Psychol., 1951, 35, 236-240.

4. Howard, A. H., & Schultz, H. G. A factor analysis of a salary job evaluation plan. J. appl. Psychol., 1952, 36, 243-246.

5. Lawshe, C. H., Jr. Studies in job evaluation. 2. The adequacy of abbreviated point ratings for hourly paid jobs in three industrial plants. J. appl. Psychol., 1945, 29, 177-184.

6. Lawshe, C. H., Jr., & Alessi, S. L. Studies in job evaluation. 4. An analysis of another point rating scale for hourly paid jobs and the adequacy of an abbreviated scale. J. appl. Psychol., 1946, 30, 310-319.

7. Lawshe, C. H., Jr., Dudek, E. E., & Wilson, R. F. Studies in job evaluation. 7. A factor analysis of two point rating methods of job evaluation. J. appl. Psychol., 1948, 32, 118-129.

8. Lawshe, C. H., Jr., & Maleski, A. A. Studies in job evaluation. 3. An analysis of point ratings for salary-paid jobs in an industrial plant. J. appl. Psychol., 1946, 30, 117-128.

9. Lawshe, C. H., Jr., & Satter, G. A. Studies in job evaluation. 1. Factor analysis of point ratings for hourly-paid jobs in three industrial plants. J. appl. Psychol., 1944, 28, 189-198.

10. Lawshe, C. H., Jr., & Wilson, R. F. Studies in job evaluation. 5. Analysis of the factor comparison system as it functions in a paper mill. J. appl. Psychol., 1946, 30, 426-434.

11. Rogers, R. C. Analysis of two point-rating job evaluation plans. J. appl. Psychol., 1946, 30, 579-585.

12. Thurstone, L. L. Multiple-factor analysis; a development and expansion of The Vectors of Mind. Chicago: Univer. Chicago Press, 1947.




Judgments of Speed on the Open Highway ¹


Abram M. Barch ²


Michigan State University 


Certain adaptation effects are often reported 
with respect to the perception of the rate of 
speed at which one is moving while riding in 
an automobile. After riding at a relatively 
constant speed for a period of time, this speed 
does not appear as fast as it did at the begin- 
ning. Furthermore, travel at a rate of speed 
below this previous speed may seem extremely 
slow. 

Despite the common agreement on the ex- 
istence of speed adaptation effects and the 
long-standing interest of psychologists in the 
perception of motion, the conditions contribut- 
ing to such adaptation or the behavioral ef- 
fects of such alteration in phenomenal velocity 
have not been studied. Even the question of 
how reliably speeds of an automobile can be 
judged has received little attention (2, 5). 

Speed adaptation has been suggested as a 
contributing factor in traffic accidents, espe- 
cially those at the end of long tangent sections 
of roadway (3, p. 24). This notion has plau- 
sibility if it can be assumed that the effect of 
speed adaptation is to cause drivers to main- 
tain higher levels of speed than they would 
otherwise in situations where a lower speed is 
conducive to safety (e.g., curves, turn-off 
lanes, signalled intersections on rural high- 
ways). 

The present study was an exploratory one 
with two objectives: (a) to determine the ac- 
curacy with which judgments of speed could 
be made by the driver of a passenger car while 
decelerating; and (b) to determine the influ- 
ence of increasing amounts of exposure to a 
given speed on these judgments. 


1 This study was conducted under a joint appointment with the Department of Psychology and the Highway Traffic Safety Center of Michigan State University with funds and facilities provided by the Center.

2 Acknowledgment is made of the assistance of Wayne Chubb, Peter Hemingway, John Nangle, and Thomas Trabasso in carrying out the experimentation and data analysis.

Experiment I 
Method 


Subjects. The Ss were 44 male volunteers enrolled in driver education courses at Michigan State University, Summer, 1957. All were working [...] degree in education, but not necessarily in driver education. The results of four Ss were not used on the basis of an a priori policy of omitting the first S run by each E. Five more Ss were lost due to failure to follow instructions, mechanical failure, or [...] or wet highway during their run. The Ss ranged in age from 22 to 52, in years of driving at least [...] m.p.y. from 2 to 28, in average yearly mileage during the past two years from 4000 to 35,000 m.p.y. and in years of driver education experience from 0 to [...] years. Median age was 28, median years of [...] m.p.y. was between 11 and 12, median yearly mileage during past two years between 12,000 and 13,000. Eighteen had some driver education teaching experience.

Apparatus. The car used in the study was a 1956 Chevrolet station wagon, in University service for one year (22,000 miles), and equipped with automatic transmission. The operation of the regular speedometer was modified so that it could be made to read zero, regardless of actual car speed, by the positioning of a mechanical switch located under the dash at the extreme right. The appearance of the regular speedometer and of the dash on the driver’s side remained unaltered. An auxiliary speedometer was mounted on the right front dash, easily visible from the right front seat but masked from the driver’s view by a light cardboard shield fitted to this speedometer.

Two modifications were added for safety: (a) a yellow and black cloth sign, 3½ ft. [...], reading “CAUTION. Test car” was taped to the lower rear side of the car; (b) a flasher was inserted in the brake lights circuit so that E, by means of a manual switch, could cause the brake lights to flash repeatedly during the test decelerations.

Test areas. The main test area was a 10-mile concrete four-lane divided section of U.S. 127 running from the edge of Holt, Michigan toward Jackson, Michigan and located 10 miles from Michigan State University. This section had only one signalled intersection (a yellow blinker) but did not have controlled access. At several points, mostly close to Holt, extra lanes were provided for turning movements.

Eleven locations or “stations” were selected as deceleration points. Six of these stations (I-1, 2, 3, 4, 5, and 6) were on the side of the highway leaving Holt and five (7, 8, 9, 10, and II-11) were on the side approaching Holt. The stations were chosen so as to have as level a roadway as possible and to be as far as possible from the end of curves and the crest of hills (with one exception, Station 6, which was on a high plateau just past the crest of a hill). Because of these limitations only two sets of stations were directly opposite each other: (a) I-1 and II-11; (b) 5 and 7.

Two practice sets of speed judgments were ob- 
tained on a blacktop secondary road about two miles 
from the University. 

Procedure. The major series of speed judgments essentially required S to drive for 20 miles at 50 mph, slowing briefly at deceleration stations during the drive in order to make speed estimates.

From the starting point (at Holt) S accelerated to 50 mph, held that speed for about 5 sec., decelerated indicating when he thought the car was at 40 and at 30 mph, and then used accelerator action to maintain his estimated 30 mph. After S had maintained his estimated 30 mph for about 5 sec. he was told to accelerate to 50 mph and continue driving at that speed until further notice. He decelerated at five more stations on the outward run, crossed to the other side of the highway and decelerated five more times on the inbound run. The procedure for making speed judgments was the same at each deceleration station. The distance in miles to each of the 11 deceleration stations was, respectively, 0.6, 1.4, 2.6, [...], 11.4, 14.5, 17.5, 18.3, 20.0. The set of judgments obtained in this series will be called Judgment Sets 1-11.

Just prior to reaching each station, E disconnected the regular speedometer and instructed S to remove his foot from the accelerator. Brakes were never applied during these decelerations. The regular speedometer remained inoperative after each deceleration until S had again accelerated to 50 mph. The regular speedometer was operative all the time S was supposed to be driving at 50 mph. However, E assisted S in maintaining 50 mph by requests to increase or decrease speed. The tolerance range was 48-52 mph. Additional requests were made near deceleration stations to keep the car within the 49-51 range.

Prior studies (2, 5) had stressed the unreliability of speed judgments. On the assumption that practice in the experimental procedure would improve the reliability of the judgments, four sets of speed judgments were obtained prior to the main series of judgments.


The procedural sequence for the study was as follows:

1. Instructions were given during the drive by E to the practice site. These stressed an interest in how people perceive and judge speed while driving. Adaptation was not mentioned in any way. The safety [...] were described, and S assured that he could refuse to follow any instruction that he felt to be unsafe.

2. For the first practice set (Set A) S accelerated the car to 35 mph, decelerated after 5 sec. at this speed, made judgments of 25 and 15 mph while decelerating, and then held his estimated 15 mph for about 5 sec. For the second practice set (Set B) S repeated the procedure for Set A but held the speed of 35 mph for 1.1 mile. Instructions to increase or decrease speed were given whenever the speed deviated from the 34-36 mile range.

3. The car was then driven by E to the main test area over blacktop secondary roads at a speed of 40-45 mph.

4. Upon arrival at the main test area, the car was stopped and background information collected. A minimum of 4 min. of no motion was required; the median no-motion time was 4 min. 12 sec.

5. Judgment Set I was then obtained at the same location (Station I-1) and through the same procedure as that for Judgment Set 1 of the main series (described above). After completing Judgment Set I, S crossed to the other side of the highway, accelerated to 50 mph driving toward Holt, held this speed for about 5 sec., decelerated at Station II-11 making judgments of 40 and 30 mph, and held his estimated 30 mph for about 5 sec.

6. The main series of judgments was then carried out.

Judgment Sets I, II, and 6 were obtained in the 
left-hand lane since a cross-over to the other side of 
the highway occurred shortly after the completion of 
each of these judgment sets. The Ss were instructed 
about 0.2 mile from Station 6 to pull into the left 
lane and remain in that lane up to the cross-over 
point, All other speed judgments were made in the 
right-hand lane unless maintenance of the 50 mph 
speed required moving into the left lane to avoid 
slow moving traffic. 

Actual car speed was observed visually and re- 
corded to the nearest mile, with an occasional re- 
cording to the nearest half-mile, for all but 9 Ss. 
For these Ss, a motion picture camera mounted be- 
hind Æ and equipped for single frame exposure was 
used to record car speed at the moment of the de- 
celeration request and at each judgment. Similar re- 
sults were obtained by the two methods of record- 
ing and the data was combined. 

Experimental design. Judgment Sets I, II, and 1 
gave three measures of minimal exposure to 50 mph 
(about 5 sec.) while Judgment Sets 2 through 11 
were intended to demonstrate the effect of increasing 
amounts of exposure to this speed. This latter com- 
parison assumed that momentary decelerations from 


50 mph would not wholly negate any adaptation 
process. i : 
The limited experimentation and observation on 


the perception of real movements suggests that the 
boundary of the field in which motion is perceived 
can strongly influence judgments of velocity (e.g.,
1). Such a consideration means that the roadside 
features as well as the roadway itself should be as 
similar as possible at all stations, but especially in 


364 Abram M. Barch 
Table 1
Speed Judgments at the Main Test Area

                      Experiment I                      Experiment II
             Mean Judgments     50-30 Diff.      Mean Judgments   50-30 Diff.
Stations     40 mph   30 mph    Mean    S.D.     30 mph           Mean    S.D.
I-1 (I)      42.3     33.9      15.5    3.8      34.              15.6    2.7
I-1 (1)      41.8     34.1      15.4    3.5      33.8             16.1    2.6
2            41.4     34.5      15.3    3.0
3            42.0     35.4      15.5    4.1
4            41.5     34.1      15.4    3.8
5            41.5     34.2      15.7    4.0      34.6             15.4    2.5
6            40.3     31.4      18.3    4.7
7            22       34.9      15.1    3.9      34.4             15.6    4.0
8            42.1     34.1      15.5    4.0
9            41.8     34.7      15.1    3.9
10           41.4     34.0      15.7    3.9
II-11 (II)   41.6     33.4      15.7    3.5      33.0             17.0    2.2
II-11 (11)   40.4     32.2      16.8    4.7      33.5             16.0    3.2


the area used for demonstrating the effect of mini- 
mal and maximal exposure to the 
(Stations I-1 and II-11).

that a course that doubled bi 
a continuous 20-mil 


Results 


Despite cautions, speeds immediately prior
to deceleration at 


much as 5 m 


al in each judgment set between the actual speed
just prior to deceleration and the speed judged
to be 30 (or 15) mph was used as our measure of
estimated speed instead of the judged 30 (or 15)
mph speed itself.
Similar results followed from the use of 
either score, but the difference scores had a 
higher reliability from one judgment set to 


another. The speeds obtained by Ss in attempting
to maintain 30 (or 15) mph by accelerator action
were not used in the analysis because it was found
that E was often uncertain as to whether S had
reached a stable held speed.
The results obtained for Practice Sets A and
B are similar to those obtained at the main
test area and are omitted to conserve space.
Table 1 presents the mean car speed judged
equivalent to 40 and 30 mph, the mean 50-30
difference score, and the standard deviation
of the difference scores for the judgment sets
obtained in the main test area. The presence
of speed adaptation, under the meaning of
the term as used here, would be shown by a
significant decrease in the mean 50-30 difference
score from early to later judgments or,
essentially, a significant increase in the actual
speed reported equivalent to 30 mph as the
amount of exposure to 50 mph is increased.
Table 2 presents the results of an analysis
of variance of the 50-30 difference scores for
Judgment Sets I, II, and 1 through 11. (Five
missing observations were estimated as suggested
by Snedecor [6, pp. 310-313].) The
Sequential Q technique tabled by Snedecor
(6, p. 251) was used to make comparisons
between individual judgment sets. The Judgment
Sets X Subjects interaction was used for


Judgments of Speed on the Open Highway 


estimating the standard error of the difference 
for each comparison. 
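The arithmetic behind the analysis of variance can be checked with a short sketch. It assumes the Table 2 mean squares read 22.43, 136.79, and 5.36 (decimal points restored from the scan), that 35 Ss took part (implied by the 34 df for subjects), and that each F ratio uses the Judgment Sets X Subjects interaction as its error term. The variable names are illustrative only.

```python
# Mean squares as read from Table 2 (decimal placement is an assumption).
ms_sets = 22.43
ms_subjects = 136.79
ms_interaction = 5.36

n_sets = 13      # Judgment Sets I, II, and 1 through 11
n_subjects = 35  # implied by 34 df for subjects
missing = 5      # estimated observations, each costing one df

# Degrees of freedom for a sets-by-subjects layout.
df_sets = n_sets - 1
df_subjects = n_subjects - 1
df_interaction = df_sets * df_subjects - missing
df_total = n_sets * n_subjects - 1 - missing

# F ratios against the interaction mean square.
f_sets = ms_sets / ms_interaction
f_subjects = ms_subjects / ms_interaction

print(df_sets, df_subjects, df_interaction, df_total)
print(round(f_sets, 2), round(f_subjects, 2))
```

Under these assumptions the degrees of freedom come to 12, 34, 403, and 449, and the two ratios round to 4.18 and 25.52, which agrees with the tabled entries.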

The difference score for Judgment Set 6 was
significantly larger (.05 level) than all others;
the difference score for Judgment Set 11 was
significantly larger (.05 level) than those of
Judgment Sets 7, 9, and 10. A demonstration
of speed adaptation would require the difference
scores for the later judgments to be significantly
smaller than those of the earlier
judgments.

Underestimation of actual speed was obtained
throughout the experiment. Table 1
clearly shows the underestimation of actual
speed for both the 40 and 30 mph judgments.
This underestimation could be the result of a
general tendency to underestimate speeds less
than 50 mph. It is more likely due to a more
or less instantaneous contrast effect resulting
from the previously experienced higher speed.
(A reduced degree of underestimation or even
an overestimation might be expected from
this last hypothesis if Ss were to make speed
judgments while accelerating instead of decelerating.)
At any rate, such underestimation
provides no evidence for a phenomenon
of speed adaptation since it did not increase
with increasing exposure to the “adapting”
speed.

Of some interest is the fact that the mean
actual speed judged equivalent to 40 mph was
essentially a bisection of the gap in mph between
the speed just prior to deceleration and
the speed judged equivalent to 30 mph. The
ratio of the mean 50-40 difference score to
the mean 50-30 difference score for each judgment
set varied from .46 to .53.

Table 3 lists the correlations between the 


Table 2

Analysis of Variance of 50-30 Difference
Scores of Experiment I

Source                      df      Mean Square    F
Judgment sets               12      22.43          4.18*
Subjects                    34      136.79         25.52*
Judgment sets X subjects    403a    5.36
Total                       449

* Significant at .01 level.
a Five df lost due to missing observations.


Table 3 
Correlation of Difference Scores from 
Various Judgment Sets 
Experiment I Experiment II 
Compari- Compari- Compari- 

sons r sons r sons r 
LHE A9 1, 11 .56 LU- 303 
Ii, i 13 ST. 89 1,5 62 
I,10 = .75 10,11 91 ie 40 
1,2 68 Sit tS) 1,11 62 
15 ey fl 6,11 79 ST .59 
1,6 45 7,11 88 5,11 .69 
id 57 7,11 .78 


difference scores for various judgment sets. 
The correlations, in general, are somewhat 
higher than might be expected in view of the 
nature of the task and the relative unfamili- 
arity of Ss with the car and the road. Corre- 
lational studies of trial-by-trial changes in the 
learning of perceptual motor skills have found 
that increasing practice results in a decrease 
in relationship between scores made on early 
trials and those made on later ones and an 
increase in the relationship between adjacent 
trials (4). Evidence for both trends was 
found here. 

Striking individual differences between Ss 
were found with the average 50-30 difference 
score for an individual ranging from 9.5 to 
24.2 mph. However, no significant relation- 
ship was found between mean difference scores 
and any of the available biographical charac- 
teristics such as age, driving mileage, years 
of driving, and driver education experience. 


Experiment II 
Method 


Subjects. The Ss were 13 male volunteers enrolled 
in the same driver education courses as those of Ex- 
periment I. Median age was 30, median years of 
1000 m.p.y. was 14, and median yearly mileage dur- 
ing past two years was 11,000. Nine had some driver 
education teaching experience. 

Procedure. The procedure for Experiment II was
the same as that for Experiment I with one major
modification. It was felt that the absence of speed
adaptation effects noted in Experiment I might be
due to (a) the use of too short a period of continuous
speed; (b) too many judgments; (c) both
the brevity of the period of continuous speed and the
use of too many judgments. Therefore, Judgment
Sets 2, 3, 4, 6, 8, 9, and 10 were omitted.
After making a set of judgments at Station I-1 and
returning to 50 mph the Ss drove continuously at
that speed for a median time of 8 min. 35 sec. before
making a set of judgments at Station 5. A set of
judgments was made at Station 7 and at II-11, with
a median time interval of 8 min. 50 sec. at 50 mph
between these locations.
An analysis of variance of the 50-30 difference
scores for the six judgment sets failed to reach significance
at the .05 level (F = 1.24 for 5 and 60 df).
No evidence for speed adaptation was obtained despite
the decrease in number of judgments and the
longer period of constant speed.
Inspection of Tables 2 and 3 indicates that quite
similar results were obtained from the two experiments
with the minor exception of the “end” effect
found for Judgment Set 11 in the first experiment
only.


Discussion 


The two studies agreed in finding no evi- 
dence for speed adaptation in the speed judg- 
ments made by drivers while decelerating un- 
der the conditions of the studies. The first 
experiment found some evidence for the op- 
posite influence on speed judgments when the 
Ss were approaching a point where they were 
going to come to a stop. However, this effect 
was not noted for Judgment Sets I and II 
where the Ss also knew a stop would follow. 
The effect for Judgment 11 was also not ob-
tained in the second experiment.

There are a number o 
procedure might hav 


of adaptation effects. For one thing, S had 
ample opportunity to i 


Ss were 

8 Unpublished study conducted by Don Trumbo 
and Peter Hemingway with equipment furnished by 
the Highway Traffic Safety Center, g 




required to attain and maintain various speeds 
without specific knowledge of any speed. No 
evidence for speed adaptation was found. 

Other tenable hypotheses are that a speed 
adaptation requires longer periods of constant 
speed, constant speed higher than the rates 
used here, or both longer periods and higher 
speeds. It is also possible that speed adapta- 
tion may be far more characteristic of the 
passenger rather than the driver in that the
driver in his coping with various road situa- 
tions may obtain subsidiary nonconstant in- 
formation about car speed. 


Summary 


Male adult drivers, while decelerating on 
the open highway, were required to make 
judgments about the speed of the passenger 
car they were driving after varying amounts 
of exposure to a constant speed of 35 or 50 
mph.

The accuracy and consistency of the judg- 
ments and the influence of varying amounts 
of exposure on these speed judgments (speed 
adaptation) were studied. Such speed judg- 
ments were found to be quite reliable and ap- 
parently independent of speed adaptation. 


Received December 18, 1957.


References 


1. Brown, J. F. The visual perception of velocity.
Psychol. Forsch., 1931, 14, 199-232.

2. Forbes, T. W. Measuring drivers’ reactions. Person.
J., 1932, 11, 111-119.

3. Matson, T. M., Smith, W. S., & Hurd, F. W.
Traffic engineering. New York: McGraw-Hill,
1955.

4. Reynolds, B. The effect of learning on the predictability
of psycho-motor performance. J.
exp. Psychol., 1952, 44, 189-198.

5. Richardson, F. E. Estimations of speeds of automobiles.
Psychol. Bull., 1916, 13, 72-73.

6. Snedecor, G. W. Statistical methods. (5th ed.)
Ames: Iowa State College Press, 1956.




Journal of Applied Psychology
Vol. 42, No. 6, 1958


Dimensions of Job Incentives Among College Students 


A. W. Bendig and Eugenia L. Stillman 


University of Pittsburgh 


Several studies (1, 2, 4, 6, 7) have reported
the results of having groups of job applicants
or industrial workers rank lists of
verbal statements of job incentives. One
major criticism of such studies is that the
verbal job incentives used are not selected on
the basis of any theoretical framework of
hypothesized dimensions of job incentives, but
each incentive is arbitrarily assumed to measure
a dimension that is independent of the
other incentive-measured dimensions in the
sample. This absence of a unifying theory
also means that variations in the wording of
similar incentives used in different studies
weaken any interstudy comparisons. If it
can be shown that several different verbal incentive
statements are measuring the same
fundamental dimensions, then an approach
can be made toward developing a taxonomy
of independent job incentive dimensions that
would aid in unifying research. At the empirical
level, both the methods of factor analysis
and of content analysis provide techniques
by which hypothesized dimensions can be
identified and explored.

Our interest is in isolating such dimensions
of job incentives among groups of undergraduate
college students as an unexplored
research area related to occupational choice
among students and in the hope that a dimensional
taxonomy developed from these more
easily accessible Ss may eventually be suggestive
of similar studies with broader samples
of industrial personnel. As a basis for initial
hypotheses about such dimensions, we assumed
that recent developments in personality
research would offer some suggestions. The
work of McClelland et al. (5) indicates that
much of the academic performance of college
Ss can be related to motivational dimensions
such as “need for achievement” and “fear of
failure.” Many of the typical job incentive
statements used in previous research can be
placed along a dimension that uses these two
McClelland constructs to define opposite poles


of this dimension. Opportunity for advance- 
ment, promotion for initiative, and similar in- 
centives would lie along the “need achieve- 
ment” segment of this dimension, while job 
security, benefits, and working conditions 
would fall at the “fear of failure” end. Con- 
sequently, a “need achievement vs. fear of 
failure” dimension seems a reasonable begin- 
ning hypothesis. 

An obvious second dimension of job incen- 
tives for college Ss is a “social service” need 
to help and assist other people. In selecting 
an occupational goal some college Ss appear 
to be more concerned with whether they will 
have an opportunity on the job to satisfy this 
need than they are with other more traditional 
job incentives. In an increasingly “other- 
directed” industrial society this dimension 
may become highly important in classifying 
job incentives. 

The study reported below is the first stage 
in a projected series aimed at developing a 
taxonomy of job incentives. We decided to 
first test the adequacy of the proposed meth- 
odology within a deliberately limited area of 
job incentives as an exploratory study of one 
hypothesized dimension. It was hoped that 
an application of the method would provide 
additional hypotheses for later developmental 


research studies. 


Procedure 


Incentives. A list of the verbal descriptions of job
incentives used in previous studies (1, 2, 4, 6, 7)
was prepared and several incentives constructed by
the present authors were added. We decided to include
on our list only those that might, on an a
priori basis, be expected to measure the hypothesized
“need achievement vs. fear of failure” dimension
and would also be applicable to college student Ss.
Incentives apparently measuring other dimensions,
particularly the “social service” variable noted above,
were excluded to provide as homogeneous a list of
incentives as possible and to reduce the factorial
complexity of the interrelationships among the incentives.
The following eight incentive statements
were selected:

1. Opportunity to learn new skills
2. Friendly fellow workers
3. Freedom to assume responsibility
4. Good job security
5. Good prospects for advancement
6. Full insurance and retirement benefits
7. Recognition from supervisors for initiative
8. Good salary


Incentives 1, 3, 5, and 7 were included to represent 
the “need achievement” pole of the hypothesized 
dimension, while Incentives 2, 4, 6 and 8 were selected 
to define the “fear of failure” pole. A form was 
prepared to collect Ss’ rankings of these eight in-
centives. The form requested the S to (a) record
his name, age, sex, curriculum group or school, and 
major subject, (b) write a brief description of the 
specific job or occupation toward which his college 
preparation was oriented, (c) rank the eight incen- 
tives in terms of how important each incentive will 
be in selecting the job the S had described above 
with the most important incentive being ranked 
“one” and the least important incentive ranked 
“eight,” and (d) if the S felt that a very important 
(to him) incentive had been omitted from the list, 
he was asked to write a brief description of the in- 
centive at the bottom of the form. It was hoped 
that this last procedural task would permit Ss to 
volunteer omitted incentives such as the “social 
service” dimension and, through a content analysis 
of the volunteered Statements, provide data for 
hypotheses about other important dimensions.

Subjects. The ranking form was distributed to
267 student Ss (174 men and 93 women) in 10 sections
of an introductory psychology course. All Ss
filled in the form in class as requested by their
instructor who had announced that the department of
psychology was makin

umanities, social
while the remaining Ss
ering and business ad-

Results 


The average rank-difference correlation
among the 267 ranking Ss (each S correlated
with every other S and these 35,511 intercorrelations
averaged) was .20 which is significant
at the .01 level of confidence.
A sample of 100 ranking Ss was randomly
drawn from the total group of 267 Ss without
regard to any demographic variables such as
sex, age, or curriculum grouping. Each S’s
ranking of the eight incentives was dichotomized
by scoring the incentives ranked from
1 to 4 as “one” and the incentives ranked 5
through 8 as “zero.” Tetrachoric correlation


coefficients were then computed among the 
eight incentives. As would be expected, the 
resulting matrix of 28 intercorrelations tended 
to be negative with the mean correlation be- 
ing — .21 and the individual coefficients rang- 
ing from .24 to — .53. This matrix was fac- 
tor analyzed by the centroid method and three 
factors were extracted. The analysis was rep- 
licated twice to stabilize the communality esti- 
mates. Inspection of the residual correlations 
indicated the absence of a fourth factor since 
the median absolute residual after extracting 
the third orthogonal factor was .12. The 
three factors were rotated to achieve orthogo- 
nal simple structure and, as far as possible, 
positive manifold. Because of the generally 
negative intercorrelations among the incen- 
tives, the factors tended to be bipolar. 
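Two points in the preceding paragraph can be illustrated with a small sketch: the count of intercorrelations, and why dichotomized rankings tend to correlate negatively. The code below is only an illustration under stated assumptions. It uses simulated random rankings rather than the study's data, and it substitutes the phi coefficient (a Pearson correlation on 0/1 scores) for the tetrachoric coefficient actually used, because phi needs no numerical integration. The function names are hypothetical.

```python
import math
import random


def dichotomize(ranking):
    """Score ranks 1-4 as 1 and ranks 5-8 as 0."""
    return [1 if r <= 4 else 0 for r in ranking]


def phi(x, y):
    """Pearson correlation between two 0/1 sequences (phi coefficient)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Number of distinct pairs among 267 Ss and among 8 incentives.
n_subject_pairs = math.comb(267, 2)
n_incentive_pairs = math.comb(8, 2)

# Simulated Ss who rank the eight incentives purely at random.
rng = random.Random(0)
rankings = [rng.sample(range(1, 9), 8) for _ in range(2000)]
scored = [dichotomize(r) for r in rankings]
columns = list(zip(*scored))  # one 0/1 column per incentive

phis = [phi(columns[i], columns[j]) for i in range(8) for j in range(i + 1, 8)]
mean_phi = sum(phis) / len(phis)

# With 50-50 splits, a tetrachoric r corresponds to sin(pi/2 * phi).
tetrachoric_equivalent = math.sin(math.pi / 2 * (-1 / 7))

print(n_subject_pairs, n_incentive_pairs)
print(round(mean_phi, 3), round(tetrachoric_equivalent, 3))
```

Because every S contributes exactly four "ones," the eight scores are forced to sum to a constant, and the average phi for random rankings settles near -1/7. Converted to a tetrachoric value under the equal-split assumption, that is about -.22, so a generally negative matrix is expected even before any real structure enters.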

The results of this factor analysis (N =
100) can be found in Table 1 along with the
mean rank for each incentive (N = 267).
The three factors accounted for 56 per cent
of the interincentive variance with 10 of the
24 loadings having absolute values of .40 or
above and 10 of the loadings falling at .20 or
below.

Factor A appears to be the “need achieve- 
ment vs. fear of failure” variable originally 
hypothesized in selecting the incentives. In-
centives 1, 3, 5, and 7 (“need achievement”)
had a median loading on Factor A of -.25,
while Incentives 2, 4, 6, and 8 (“fear of fail-

Table 1 


Mean Ranks (N = 267) and Factor Loadings (Decimal
Points Omitted and N = 100) for
Eight Job Incentives 


Rotated Factors 


Incentives Tank A B Cee 
1. New skills 44-69 15 07 3 
2. Fellow workers 4.8 27 6l 07 a 
3. Responsibility 3.8 -5 44 24 a 
4. Job security 3.6 32 —17 84 F 
5. Advancement 3.4 14 —54 —S9 o 
6. Job benefits 67 a o 8# 
7. Initiative 5.4 o3 10 -54 3 
8. Salary 3.8 2-4 «2 7 
Percentage of 6 
Total Variance 18 20 18 5 




ure”) showed a median loading of .30 with
no overlap between these two subgroups of
incentives in Factor A loadings. Incentives 4
and 6 defined the “fear of failure” pole while
Incentives 1 and 3 best measured the “need
achievement” end of the Factor A dimension.
Incentives 2 and 3 fell at one end of the
Factor B dimension while Incentives 5 and 8
measured the opposite factor extreme. As an
a posteriori hypothesis we might characterize
this dimension as an “interest in the job itself
vs. the job as an opportunity for acquiring
status” factor from an inspection of the incentives
defining the factor poles. We would
expect Ss clustering at the first end of this
factor to be concerned with whether a given
job would be personally important to them
and with the personal characteristics of the
people with whom they would be working.
The Ss at the other end of this dimension
would regard the job as a spring-board for
upward mobility in terms of income, authority,
and job title. However, since we did
not predict in advance the appearance of such
a factor we can offer this interpretation only
as an hypothesis for future research.
Factor C was bipolar with Incentive 4 defining
the positive end of this dimension and
Incentives 5 and 7 falling at the negative pole.
Factor C is difficult to “name,” but we might
hazard the guess that it concerns the attitude
of the S toward his supervisors or employers.
The S at the positive pole of Factor C wants
to be autonomous and secure in his job, while
the S at the other pole needs recognition and
advancement from his immediate supervisor.
Tentatively, we might label this a “job autonomy
vs. supervisor dependent” factor.
For later research it would be desirable to
have a single score measure of these three
factors even though such a score
would be somewhat crude and of low validity
in the present stage of research. By examining
the ranking factor loadings in Table 1, it
can be seen that the most valid and factorily
pure measure of Factor A would be obtained
by computing for each S the difference between
his rankings of Incentives 6 and 1.
Similarly a factor “score” for Factor B can
be obtained by subtracting his ranking of
Incentive 2 from his ranking of Incentive


8 and a Factor C score found by finding the 
ranking of Incentive 7 minus his ranking of 
Incentive 4. Positive factor scores computed 
in the above manner should reflect high need 
achievement, high interest in the job, and high 
need for autonomy from supervision. Nega- 
tive factor scores would measure high fear 
of failure, strong attitude toward the job as 
a stepping-stone for advancement, and high 
need for a dependency relation to the super- 
visor. 
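The scoring rule just described amounts to three rank differences per S. A minimal sketch follows, assuming Factor A uses Incentive 6 minus Incentive 1 (the first incentive number is partly illegible in the scan, so 6 is an inference from the sign convention), Factor B uses Incentive 8 minus Incentive 2, and Factor C uses Incentive 7 minus Incentive 4. The function name and the example ranking are hypothetical.

```python
def factor_scores(ranks):
    """Compute crude factor scores from one S's ranking.

    ranks maps incentive number (1-8) to rank, where 1 is most important.
    """
    if sorted(ranks) != list(range(1, 9)) or sorted(ranks.values()) != list(range(1, 9)):
        raise ValueError("expected a complete ranking of incentives 1-8")
    return {
        "A": ranks[6] - ranks[1],  # positive: need achievement
        "B": ranks[8] - ranks[2],  # positive: interest in the job itself
        "C": ranks[7] - ranks[4],  # positive: autonomy from supervision
    }


# Hypothetical S who ranks new skills first and job benefits last.
example = {1: 1, 2: 3, 3: 2, 4: 4, 5: 6, 6: 8, 7: 7, 8: 5}
scores = factor_scores(example)
print(scores)
```

Each score can range from -7 to +7. Because all three are built from one set of ranks, they are not free to vary independently within a single S, which is one reason to treat them as crude indices only.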

Seventy-seven of the 267 Ss who ranked the 
incentives, or 29 per cent, volunteered addi- 
tional job incentives that they felt were not 
covered by the eight that they ranked. There 
was no sex difference in the percentage volun- 
teering incentives with 50 of the 174 male Ss 
contributing (29 per cent) and 27 of the 93 
female Ss giving additional incentives (29 per 
cent). Incomplete data suggested a possible 
verbal ability difference between the Ss who 
did and did not contribute incentives. Raw 
scores on a 30-item synonym test (five-choice 
items taken from the Cooperative Vocabulary 
Test) were available for 90 male Ss, 23 of 
whom had contributed incentives while 67 had 
not. The mean score of the contributors was 
15.5 and the mean score of the noncontribu- 
tors was 13.4. This difference in mean vo- 
cabulary scores was significant at the .05 level 
of confidence (t = 2.05). 

The 77 contributed incentives were trans- 
ferred to cards and an attempt was made to 
develop incentive categories from the mani- 
fest content of the statements. The con- 
tributed incentives were quite heterogeneous 
and only three identifiable categories, ac- 
counting for 58 per cent of the statements, 
contained a reasonable percentage of the con- 
tributed incentives. These three categories, 
plus a basket “miscellaneous” category, were: 

1. Opportunity to Help Others (25 per 
cent): opportunity to assist and guide chil- 
dren, to care for the physically ill and handi- 
capped, and to help people to adjust in their 
jobs and daily lives. 

2. Job Satisfaction (19 per cent): feeling 
satisfied with the type of job and enjoyment 
of the daily activities in a particular field. 

3. Job Interest and Variety (14 per cent): 


the amount of stimulation and challenge pro- 




Table 2 


Numbers of Students Volunteering Job Incentives 
in Four Derived Categories 


Incentive Categories             Total   Men   Women
1. Opportunity to help others    19      7     12
2. Job satisfaction              15      9     6
3. Job interest and variety      11      7     4
4. Miscellaneous                 32      27    5
Total                            77      50    27


vided by the job and the relative absence of 
routine and repetitive tasks. 

4. Miscellaneous (42 per cent): job loca- 
tion and physical facilities, social status and 
community recognition of the job, independ- 
ence of authority, job mobility, etc, 

The presence of an important “social serv-
ice” category (Category 1) among the con-
tributed incentives is hardly surprising in
view of the discussion at the beginning of
this paper. Of interest is the appearance of
a “job satisfaction” category as an independ-
ent job incentive since most industrial psy- 
chologists have used the term “job satisfac- 
tion” as the desirable end result of satisfying 
the needs of the employee by supplying the 
important incentives in the job situation. 
Many of our Ss apparently viewed “job satis- 
faction” as a more limited factor which was 
related to the day-to-day behavioral require-
ments of the job. Category 3 (job interest
and variety) may be peculiar to college Ss 
and to more intelligent employees who tradi- 
tionally abhor dull and monotonous job re- 
quirements. 

Although no sex difference was found in the 
total percentages of men and women Ss con- 
tributing incentives, inspection of Table 2, 
which shows the number of Ss in each sex 
group contributing incentives in each of the 
four derived categories, indicates a pro- 
nounced sex difference in Categories 1 and 4. 
Almost one half of the incentives contributed 
by the women fell in Category 1, while less 
than one seventh of the male incentives fell in 
this category. The men contributed many 
more diverse and heterogeneous incentives 
than did the women with over one half of the 
male incentives falling in Category 4 and less 




than one fifth of the female incentives being 
grouped in this same basket category. The 
chi-square value for this contingency table was 
11.9 which, with three degrees of freedom, is 
significant at the .01 level. No sex differences
for Categories 2 and 3 are evident with al-
most identical percentages of male and fe- 
male incentives falling in these two categories. 
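The contingency test can be retraced with a short sketch. It assumes the Table 2 counts for men are 7, 9, 7, and 27 and for women 12, 6, 4, and 5; two of the women's cells are illegible in the scan and are reconstructed here by subtracting the men's counts from the row totals. The function name is hypothetical, and the statistic is the ordinary Pearson chi-square without any correction.

```python
def chi_square(table):
    """Pearson chi-square and df for a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: the four incentive categories; columns: men, women.
observed = [
    [7, 12],   # 1. Opportunity to help others
    [9, 6],    # 2. Job satisfaction (women's count reconstructed)
    [7, 4],    # 3. Job interest and variety
    [27, 5],   # 4. Miscellaneous (women's count reconstructed)
]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)
```

With these reconstructed counts the statistic comes out near 12.1 on three degrees of freedom, slightly above the reported 11.9 but leading to the same conclusion, since both exceed the tabled .01 critical value of about 11.34. The small gap may reflect rounding in the original hand computation or a misread cell.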


Discussion 


The results of this exploratory study indicate
that the ranking methodology used offers
a promising approach in studying the dimensions
of job incentives. However, factor
analyses of incentive rankings should be
expanded by including in the sample of incentives
verbal statements reflecting job incentives
along more dimensions than were
included in the present sample. Additional
incentive statements can be written to describe
Factors B and C and the content analysis
categories found in the present study and a
factor analytic study of the expanded list of
incentive statements would provide evidence
as to the adequacy of our interpretations of
these dimensions. Such an expansion of the
method also would clarify the descriptions of
the factors and permit the development of
more reliable and factorily more valid factor
scores. The random adding of incentive
statements that are not derived from 5 
pothesized dimensions would be a hore 
approach that would, in the long run, Ser 
the development of an adequate taxonomy of
incentives.

The factor analysis extracted three orthogonal
factors which were tentatively identified
as: A. Need achievement vs. fear of failure;
B. Interest in the job itself vs. the job as an
opportunity for acquiring status; and C. Job
autonomy of supervisor vs. supervisor dependency.
The content analysis of the contributed
incentives resulted in three further categories:
1. Opportunity to help others; 2. Job satisfaction;
and 3. Job interest and variety. Although
the three factor dimensions are independent
of each other, we cannot assume that
the content categories are independent or that
the categories are uncorrelated with the factors.
For example, a reasonable hypothesis
would suggest that Factor B is the same dimension
as Categories 2 and 3. Another
hypothesis is that bipolar Factor A may split
into two factors when additional incentive 
statements are included in the sample to be 
ranked. These hypotheses should be tested 
in the next developmental stage of research 
on job incentives. 

It must be emphasized that our results sug- 
gesting certain incentive dimensions and our 
interpretations of the dimensions are highly 
tentative and exploratory. At this prelimi- 
nary stage of research it seems wiser to work 
with relatively limited successive samples of 
Ss, constantly revising procedure and testing 
hypotheses derived from preceding samples, 
than to attempt a single-shot study with a 
large sample of incentive statements and a 
huge sample of heterogeneous Ss. We antici- 
pate that our present interpretation of these 
few dimensions will have to be markedly re- 
vised and expanded as research information 
accumulates. 

Finally, the results and suggestions for fu- 
ture research can be generalized only to the 
population of undergraduate college students
and cannot, at the present time, be applied to 
industrial situations. Dimensions that appear 
among job incentives ranked by college Ss 
may not appear in the same form or appear
at all in similar research with industrial work- 
ers. Similarly, other dimensions may be im- 
portant in industry that fail to appear in col- 
lege samples of Ss. Only additional research 
can delimit the generalizability of results 
found for our limited sample. 


Summary 


College Ss were asked to describe their job 
goal at graduation and to rank eight selected 




job incentive statements as to their impor- 
tance in choosing the job. A factor analy-
sis of intercorrelations (N = 100) among the
ranked incentives yielded three factors with 
the factors being tentatively identified as: 
need achievement vs. fear of failure, interest 
in the job vs. the job as an opportunity for 
acquiring status, and job autonomy of super- 
vision vs. supervisor dependency. A content 
analysis of incentive statements contributed 
by 29 per cent of the ranking Ss (N = 267) 
gave three major categories: opportunity to 
help others, job satisfaction, and job interest 
and variety. 


Received February 3, 1958. 


References 


1. Ganguli, H. C. An inquiry into incentives for
workers in an engineering factory. Indian J.
soc. Wk., 1954, 15, 30-40.

2. Graham, D., & Sluckin, W. Different kinds of re- 
ward as industrial incentives. Res. Rev., Dur- 
ham, 1954, No. 5, 54-56. 

3. Herzberg, F., Mausner, B., Peterson, R. O., & 
Capwell, Dora F. Job attitudes: Review of 
research and opinions. Pittsburgh: Psycho- 
logical Service of Pittsburgh, 1957. 

4. Jurgensen, C. E. What job applicants look for 
in a company. In M. L. Blum (Ed.), Read- 
ings in experimental industrial psychology. 
New York: Prentice-Hall, 1952. Pp. 107-114. 

5. McClelland, D. C., Atkinson, J. W., Clark, R. A.,
& Lowell, E. L. The achievement motive. 
New York: Appleton-Century-Crofts, 1953. 

6. Wilkins, L. T. Incentives and the young worker. 
Occup. Psychol., Lond., 1949, 23, 235-247. 

7. Wilkins, L. T. Incentives and the young male 
workers in England: With some notes on rank- 
ing methodology. Int. J. Opin. Attitude Res., 
1950, 4, 541-562. 


Journal of Applied Psychology
Vol. 42, No. 6, 1958


Job Content and Workers’ Opinions 1


James E. Kennedy 


Bureau of Industrial Psychology, University of Wisconsin 


and Harry E. O’Neill 


Personnel Evaluation Services, General Motors Institute 


In recent years a number of writers (3, 4, 
5, 6) have expressed concern over the adverse 
effects of simplified job content on workers’ 
attitudes. Only a few empirical studies (2, 
7, 8) have been addressed to this problem 
and all support the view that the results of 
job simplification may be a source of indus- 
trial conflict. 

Since job simplification is still accepted, 
rightly or wrongly, as a cardinal principle of 
industrial management in many types of 
work situations, it becomes of interest to ex- 
plore further the kinds of simplification or 
enlargement of job content that are associated 
with changes in workers’ attitudes and opin- 
ions. 

In the course of conducting a broader per- 
sonnel research project, a survey of workers’ 
opinions toward their supervisor and the gen- 
eral work situation was administered to sam- 
ples of hourly operators in four production 
departments of an automotive assembly plant. 
In two of these departments workers were
performing on two types of jobs of clearly
different content. In the other two departments
workers were also performing on these
two types of jobs, but the content of one of
the jobs recently had been modified. This
combination of circumstances permitted exploring
the relationship between job content
and workers’ opinions.

In all four departments there was a sharp
cleavage between the amount of decision making
and power exercised by salaried members
of management compared to the hourly workers.
Katz (4) has suggested that under this
condition the thwarting of the individual’s
self-determination and craftsmanship is more
likely to be a problem than when there is no
such cleavage.


¹ The writers are grateful to Orlo L. Crissey for
permission to use the data and to C. S. Bridgman for
his helpful comments.

The Opinion Questionnaire 


The instrument used to survey opinions
was a 71-item questionnaire having the same
format and borrowing most of its content
from the questionnaire developed by Comrey,
Pfiffner, and High and described in their report
of “Factors Influencing Organizational
Effectiveness” (1). The selection of scales
for this study from the Comrey questionnaire
was based on the sizes of the correlations the
original investigators had obtained between
their various scales and organizational production
measures as well as the appropriateness
of the scales for an automotive assembly
situation.

As used in the present study, the question- 
naire consisted of 14 relatively independent 
dimensions or groups of homogeneous items 
pertinent to different content areas. Three of 
the scales were concerned with more general 
aspects of the work situation, namely: pride 
in the work group, relations with other units, 
and confidence in the company. The remaining
eleven scales related to the immediate
supervisor’s consistency of behavior, decisiveness,
discipline, judgment, job competence,
job helpfulness, receptiveness to suggestions,
ability to organize his work, safety enforcement,
relations with subordinates, and ability
or willingness to communicate downward.
Four additional items, not included in the
Comrey questionnaire, referred to over-all job
satisfaction, over-all satisfaction with the
supervisor, communications upward, and the
quality of training that had been received.


The Samples 
Hourly workers in four production departments
were surveyed. These departments were
located adjacent to one another along the
assembly line and each performed a set of
comparable assembly operations as the line passed
through their areas. One department assembled
the body, another painted it, a third
applied the trim and a fourth assembled the
chassis and then assembled it to the body. 
These departments will be referred to as A, 
B, C, and D, respectively. The organization 
of the departments was the same, each being 
divided into foreman’s sections which con- 
sisted of 20 to 30 assembly operators and one 
or two utility men. All of the utility men 
and a 20 per cent randomly selected sample 
of assembly operators from each foreman’s 
section were surveyed. 


Differences in Job Content 


The mean survey scores of the assembly 
operators and the utility men were compared. 
The content of the two jobs differed in the 
following ways. 

Assembly operators. Each assembly opera- 
tor performed a specific task or set of tasks 
as the assembly line passed his station. The 
complete time cycle for each task was be- 
tween one and two minutes. The tasks were 
either identical for each make and model of 
car or were only negligibly different. Each 
assembly operator’s job was highly repetitive, 
routine, deskilled, mechanically paced, and 
such that the end result of his efforts con- 
tributed only an infinitesimal part of the to- 
tal process of assembling a complete car. 

Utility men. These operators were assigned
to each foreman’s section to perform various
utility functions. These functions were (a)
to relieve assembly operators for scheduled or
emergency breaks, (b) to help assembly operators
who for one reason or another were
unable to keep up with the line, (c) to demonstrate
the job to new operators and gradually
yield parts of the job until the new operator
could keep up with the line, (d) to
perform temporarily the job of an assembly
operator who was absent until other relief
could be found, and (e) to complete or correct
operations done incompletely or incorrectly.


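The first major difference between the assembly
operator job and the utility man job was
that the former performed a single, routine and
repetitive task while the latter performed a
variety of these same routine tasks (as many
as 20 or 30) and performed them for lengths of time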
varying between one minute and, on infre- 
quent occasions, a day. The second major 
difference is in the fact that the assembly 
operator had, for all practical purposes, no 
area of discretion as his job was defined. The 
particular task a utility man performed at 
any given time was largely dictated by the 
situation or assigned by the foreman but an 
area of limited choice remained which, small 
as it may have been, was greater than that 
of the assembly operator. 

This difference in job content existed in all 
four departments for a number of years and 
at the time of the survey still existed in two 
departments, A and B. In the other two de- 
partments, C and D, the content of the util- 
ity man’s job was changed shortly before the 
survey was administered. The circumstances 
surrounding this change were as follows. 

As part of a broader personnel program the 
decision was made to improve the quality of 
training that was given to newly hired as- 
sembly operators, to train experienced assem- 
bly operators on several jobs in addition to 
their regularly assigned job so that they might 
be rotated on jobs when the need arose, and 
to improve work methods used by assembly 
operators. The responsibility for implement- 
ing this plan was assigned to the utility men 
and their job duties were expanded to include 
the training and methods functions. To aid 
them in performing these new functions a 
training program in work methods and train- 
ing techniques was developed. The program 
consisted of 11 one-hour lecture sessions given 
over a five week period. After this change 
took place the utility men spent about one 
half of their time on their original duties and 
one half on their newly assigned duties. In 
order to have assurance that the utility men 
would have time for performing their new 
duties, a number of assembly operators were 
upgraded to the status of utility men to help 
in the more routine tasks of providing relief, 
making repairs, etc. 

At the time the opinion survey was administered
Department C had been assigned
new utility men and had completed its training
by two weeks, Department D had been
assigned new utility men and had completed
half its training, and Departments A and B
had not been assigned new utility men nor
had they received any training.


Table 1

Comparison of Mean Survey Scores for Utility Men and Assembly Operators in Four Production Departments

                        Dept. A              Dept. B              Dept. C               Dept. D
                     (No Training)        (No Training)     (Completed Training)     (In Training)
                     N    Mean    SD      N    Mean    SD      N    Mean    SD       N    Mean    SD
Utility Men          32  233.94  35.56    33  232.36  47.33   [?]  246.81  36.13     41  257.32  29.[?]
Assembly Operators   99  240.93  40.32    78  233.82  42.26   135  233.15  42.63    103  237.38  49.[?]


Results 


The first comparison was made between the 
mean total survey scores of the assembly op- 
erators and the utility men in Departments A 
and B. In view of the general belief that job 
simplification gives rise to unfavorable atti- 
tudes it would be expected that the assembly 
operators, performing on the more simplified 
jobs, should hold less favorable opinions of 
their supervisors and the work situation in 
general than the utility men. The results 
seen in the first two columns of Table 1 indi- 
cate that there was no significant difference 
between the means in either department. 
(Department A, t = .89 and Department B,
t = .15.)
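
As a rough check on the comparison for Departments A and B, a t value can be recomputed from the N, mean, and SD entries of Table 1. The sketch below assumes an independent-samples t test with pooled variance, which the article does not state; the function name `pooled_t` is hypothetical.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t from summary statistics (pooled variance is an assumption)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error

# Table 1 entries: utility men first, assembly operators second.
t_dept_a = pooled_t(32, 233.94, 35.56, 99, 240.93, 40.32)
t_dept_b = pooled_t(33, 232.36, 47.33, 78, 233.82, 42.26)

print(round(abs(t_dept_a), 2), round(abs(t_dept_b), 2))
```

Under this assumption both values are well below any conventional significance level and lie close to the reported figures; small discrepancies could reflect rounding or a different variance formula.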

The scales from the questionnaire were di- 
vided into those concerned with the more gen- 
eral aspects of the work situation (pride in 
work group, relations with other units, con- 
fidence in the company and satisfaction with 
the job itself) and those concerned with the 
immediate supervisor. Comparison of the two
groups of workers on both of these subtotals 
showed no significant differences in either de- 
partment. 


Comparison between the assembly operators
and utility men in Departments C and D
showed that utility men in both departments
held significantly more favorable
opinions as reflected in the mean total survey
scores. These results are found in the third
and fourth columns of Table 1. (For Department
C, t = 2.43, significant at the 5 per
cent level, and for Department D, t = 2.92,
significant at the 1 per cent level.) In Department
C the mean subscores of the scales
relating to the work situation in general were
significantly higher for utility men at the 10
per cent level (t = 1.84) and those relating
to the supervisor at the 1 per cent level (t =
2.63). In Department D the differences were
significant at the 5 per cent level (t = 2.61)
and the 1 per cent level (t = 4.09) for the
general and supervisor subscales, respectively.

Since both pretraining and posttraining
measures were not available for these departments,
we cannot dismiss entirely the possibility
that the utility men’s scores were already
higher before training. Indirect evidence
suggesting this was not the case is found in
the fact that in the two departments (A and
B) where pretraining measures were available
no significant differences in the total scores of
utility men and assembly operators were observed.


Discussion 


Contrary to general expectations no difference
was observed in the favorableness of
opinions of the workers on two jobs of clearly
different content in Departments A and B.
The question might be raised as to whether
the survey instrument was sensitive enough
to reflect the kinds of differences that might
reasonably have been expected. Comparison
of the scores for utility men and the assembly
operators in Departments C and D suggests
that it was.


The fact that utility men had more favorable
attitudes than assembly operators in
these two departments may be accounted for
by any one, or combination of, the following
kinds of influences.

The mean scores for utility men in Departments
C and D were based on the questionnaire
scores of two types of utility men: those
who recently had been upgraded from the job
of assembly operator and had not received the
training program and those who recently had
been assigned the responsibility for training
and methods and had received (or were receiving)
the training program.

In the case of the newly appointed utility 
men an expansion of job duties and change in 
status was involved. Either of these factors 
may have influenced their opinions favorably. 
From the results in Departments A and B 
these effects would not be expected to persist. 

In the case of the experienced utility men, 
the expansion of their duties to include the 
more challenging functions of methods and 
training may have brought about the more 
favorable attitudes. The change in job con- 
tent was such that entirely new skills and 
knowledge would be involved and consider- 
ably more leeway in decision making and self- 
scheduling required. Since the content had 
been changed so recently the permanence of 
any such effect would be in considerable 
doubt. 

An alternate explanation might be the possible
operation of a Hawthorne effect. The
favorable effect might not have been the expansion
of job duties as much as the fact that
management had singled out this group for
special treatment. The training program in
question was the first time that management
had taken hourly operators off their jobs to
provide them with classroom training.

In any case, the survey did appear to be
sensitive to the kind of differences that would
have been expected if the original groups of
utility men and assembly operators in Departments
A and B had held different opinions
as a result of differences in their job
content.


Summary and Conclusions 


Assembly operators performing highly rou- 
tine and repetitive tasks held no less favor- 
able opinions toward their supervisors and to 
the work situation than did utility men per- 
forming a wide variety of these routine tasks. 
The instrument used to survey these opinions 
was seen to be sensitive enough to show dif- 
ferences between assembly operators and util- 
ity men when utility men were singled out by 
management for special treatment and had 
their job duties further expanded. If job con- 
tent is a factor in determining how favorably 
workers view their supervisor and their work 
situation, the difference in content apparently 
must be along more fundamental dimensions 
than those observed in this study. 


Received February 7, 1958. 


References 


1. Comrey, A. L., Pfiffner, J. N., & High, W. S.
Factors influencing organizational effectiveness:
A final report. Los Angeles: Office of
Naval Research, Univer. Southern California,
1954.

2. Davis, L. E. Job design and productivity: A new
approach. Personnel, 1957, 33, 418-430.

3. Haire, M. Psychology in management. New
York: McGraw-Hill, 1956.

4. Katz, D. Satisfactions and deprivations in industrial
life. In A. Kornhauser, R. Dubin, &
A. M. Ross (Eds.), Industrial conflict. New
York: McGraw-Hill, 1954.

5. Krech, D., & Crutchfield, R. S. Theory and
problems of social psychology. New York:
McGraw-Hill, 1950.

6. Mann, F. C., & Hoffman, L. R. Individual and
organizational correlates of automation. J.
soc. Issues, 1956, 12, 7-17.

7. Morse, Nancy C. Satisfaction in the white collar
job. Ann Arbor: Survey Res. Center, 1953.

8. Walker, C. R., & Guest, R. H. The man on the
assembly line. Cambridge: Harvard Univer.
Press, 1952.


Journal of Applied Psychology
Vol. 42, No. 6, 1958


A Study of the Purdue Non-Language Adaptability Test 


W. F. Seibert 


Audio-Visual Center, Purdue University ¹


The development of each new mental meas- 
urement device brings with it the necessity for 
a variety of studies which supply information 
adequate to describe relevant performance 
characteristics of the new device. The Purdue 
Non-Language Adaptability Test (hereinafter 
referred to as the PNLAT) is one such new 
device and may be classified as a non-language 
group test of mental ability, intended for busi- 
ness and industrial use. Previous reports (1, 
2, 3) have supplied PNLAT norms, reliability 
and validity estimates, and information con- 
cerning various other characteristics of this 
test.² The writer feels that the present study
makes some additional contributions and may
be of interest to those contemplating use of
the PNLAT.


Problem 


This study was conducted to provide at
least partial answers to the following questions:
1. What is the predictive validity of the
PNLAT in a training situation which is fairly
typical of many found in business and industry?
2. How does the predictive validity
of the PNLAT compare with that of a more
common language intelligence measure? 3.
Does the test exhibit a reasonable degree of
construct validity, i.e., does the PNLAT bear
an acceptably high relationship to another
measure of intelligence? 4. Does the PNLAT
correlate significantly with variables which
should be almost independent of intelligence?
Question four has been divided into two parts.
First, is the PNLAT correlated with near
distance, binocular visual acuity? The contents
of PNLAT items suggest that such a relationship
might exist.³ Second, in a mature,
presenile group, does a relationship exist between
PNLAT and the age variable?


¹ This study was conducted while the writer was at
the Purdue Calumet Extension Center, Hammond,
Indiana.

² No journal articles concerning the PNLAT have
appeared prior to this and readers are therefore referred
to the three relevant theses. These theses may
be obtained on loan from the Purdue University
Library, Lafayette, Indiana.

³ The PNLAT may be described as a 36-item,
15-minute time limit group test. Each item consists of


Procedure 


The sample. Test subjects were 62 male students 
enrolled in a night course in introductory psychology 
at the technical institute level. Ages ranged from 
18 to 48 years, with a mean of 29.8 and a standard
deviation of 8.4. All but two of the Ss were employed
full time in business, industry, or government. The 
two unemployed Ss were enrolled as full-time stu- 
dents. Thirty seven per cent of the group were em- 
ployed in the steel production industry, 26% in 
machinery or automotive production, 19% in the 
petroleum industry, 10% in public utilities, and 5% 
in miscellaneous businesses or government work. 
Eighteen per cent held supervisory positions. Thirty 
seven per cent were salaried, 60% were hourly paid. 

The average number of years of formal education 
was 11.8 and the group was extremely homogeneous 
with respect to this variable. Seventy-one per cent 
had completed exactly 12 years of education and an 
additional 13% had completed either 11 or 13 years.

Nature of the training. The introductory psy- 
chology course in which the Ss were enrolled was 
taught in 16 two-hour sessions and is part of various 
Purdue Technical Institute curricula. The course
dealt with most of the usual topics to be found in
introductory psychology courses, e.g. perception,
emotion, and intelligence, but emphasized applications
of psychological principles to problems of worker
supervision and the understanding of fellow workers.

The composition of the sample and the level at
which the course was taught suggest that this situation
has much in common with many business and
industrial training situations.

Obtaining the data. Each of the 62 Ss completed
a personal information questionnaire at the beginning
of the fourth class meeting. Immediately following
this, the PNLAT ⁴ and the Adaptability Test, Form
A (6), were administered. The Ss received and completed
the tests in approximately counter-balanced
order. Thirty-four of them took the Adaptability
Test prior to the PNLAT and 28 took the two tests
in reverse order. Recommended testing procedures
were followed.


a set of 10 geometric designs or patterns, four of
which are identical. In all but seven of the items
the boundaries of patterns within an item are identical
and discriminations must be based upon [illegible]
of visible detail within the boundaries. The S is required
to identify the four identical alternatives
within a given group of 10 and to indicate his
choices by marking a large cross through the alternatives
chosen.

⁴ The test administered was the final preliminary
form of the PNLAT, then designated as the NLPT
(Non-Language Personnel Test), Form AB. All
items in the preliminary and published forms of the
test are identical. The two forms differ only in the
contents of their cover pages.


Visual acuity was tested with the near distance,
binocular visual acuity subtest of the Bausch and 
Lomb Ortho-Rater. Individual testing sessions were 
arranged at the convenience of the Ss and extended 
throughout the 16-week semester. The Ss were 
tested with or without spectacles, depending upon the 
condition which had prevailed when the intelligence 
tests were administered. 

The criterion used in estimating the predictive 
validity of the PNLAT consisted of scores on 120 
multiple choice achievement items covering material 
taught in the introductory psychology course. Scores 
were not corrected for chance success. The 120 items 
were administered, untimed, as two tests (mid- 
semester and final examination) and the odd-even 
reliability of this criterion, S-B corrected, was found
to be .78. There was no contamination of the criterion
through illicit use of predictor data.
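
The Spearman-Brown (S-B) step-up applied to the odd-even correlation can be illustrated briefly. The sketch below assumes the usual doubling form of the correction and works backward from the corrected reliability of .78 to the half-test correlation it implies; the uncorrected odd-even value is not reported, and the function names are hypothetical.

```python
def spearman_brown(part_r, length_factor=2.0):
    """Reliability of a test length_factor times as long as the part that yielded part_r."""
    return length_factor * part_r / (1.0 + (length_factor - 1.0) * part_r)

def implied_half_test_r(full_reliability):
    """Invert the doubling case: odd-even correlation implied by a corrected reliability."""
    return full_reliability / (2.0 - full_reliability)

odd_even_r = implied_half_test_r(0.78)      # implied, not reported in the study
stepped_up = spearman_brown(odd_even_r)     # recovers the corrected value

print(round(odd_even_r, 3), round(stepped_up, 2))
```

On this reading, the correlation between the odd and even halves of the 120 items would have been somewhat below the corrected figure.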

The student group’s average scores and standard 
deviations of scores for the tests which constitute 
the main variables of this study are presented in 
Table 1.


Results and Discussion 


Table 2 presents the Pearson product-moment
coefficients of correlation upon which
the major portion of the conclusions of this
study are based.

Concerning the first of our questions, it is
seen that the PNLAT correlates .367 with
the achievement test criterion and that this r
is significantly different from zero beyond the
.01 level of confidence. Therefore, it may be
concluded that the PNLAT has demonstrated
statistically reliable predictive validity.
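
The significance statements attached to the correlations can be checked with the usual t transformation of r. The sketch below assumes a two-tailed test with N - 2 = 60 df and uses critical values taken from standard t tables; neither the direction of the test nor the method is stated in the study, and the names are hypothetical.

```python
from math import sqrt

N_SUBJECTS = 62
# Two-tailed critical t values for 60 df, from standard tables.
T_CRIT_05 = 2.000
T_CRIT_01 = 2.660

def t_from_r(r, n=N_SUBJECTS):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)

for label, r in [("PNLAT vs. criterion", 0.367),
                 ("PNLAT vs. visual acuity", 0.156),
                 ("PNLAT vs. age", -0.326)]:
    t = t_from_r(r)
    if abs(t) > T_CRIT_01:
        level = "beyond .01"
    elif abs(t) > T_CRIT_05:
        level = "beyond .05"
    else:
        level = "not significant"
    print(f"{label}: r = {r:+.3f}, t = {t:.2f}, {level}")
```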

The second question concerns the relative
effectiveness in prediction of the PNLAT and
the Adaptability Test. Here, Hotelling’s F
test (4, p. 54) was used to test the null hypothesis
that the predictive validity coefficients
of the PNLAT (.367) and the Adaptability
Test (.597) are not different. The obtained
F ratio is 44.26 (with 1 and 59 df), significant
beyond the .001 level of confidence. It must
be concluded that, in the present case, the
Adaptability Test was a significantly better
predictor than the PNLAT.


Table 1

Test Score Averages and Standard Deviations
(N = 62)

Test Name                      Average Score    Standard Deviation
PNLAT                              20.48               5.49
Adaptability Test                  18.74               5.43
Ortho-Rater, visual acuity         10.19               1.38
Criterion achievement test         66.76              10.37


Table 2

Correlations Between Variables

                     Criterion     (D)        (C)        (B)
Age (A)                -.012      -.286*     -.192      -.326**
PNLAT (B)               .367**     .156       .433**
Adapt. Test (C)         .597**     .290*
Vis. Acuity (D)         .297*

* Significant beyond .05 level.
** Significant beyond .01 level.
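
The structure of a test for the difference between two validity coefficients obtained on the same subjects can be sketched as follows. The code uses a commonly cited form of Hotelling's test for dependent correlations, in which the squared t is an F ratio with 1 and N - 3 df; whether this is the exact formula given in (4) is an assumption, and the function name is hypothetical.

```python
def hotelling_f(r_crit_1, r_crit_2, r_between, n):
    """F (1 and n - 3 df) for the difference between two dependent correlations.

    r_crit_1, r_crit_2: correlations of each predictor with the criterion.
    r_between: correlation between the two predictors.
    """
    determinant = (1.0 - r_crit_1 ** 2 - r_crit_2 ** 2 - r_between ** 2
                   + 2.0 * r_crit_1 * r_crit_2 * r_between)
    return ((r_crit_1 - r_crit_2) ** 2 * (n - 3) * (1.0 + r_between)
            / (2.0 * determinant))

# Correlations from Table 2: PNLAT and Adaptability Test with the criterion,
# and the PNLAT with the Adaptability Test.
f_ratio = hotelling_f(0.367, 0.597, 0.433, 62)
print(round(f_ratio, 2))
```

Applied to the Table 2 correlations, this form gives an F of roughly 4.4, which differs from the printed ratio; the sketch is meant only to show how the three correlations enter the test.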

The correlation of .433 between the PNLAT 
and the Adaptability Test gives evidence of 
the PNLAT’s construct validity and pertains 
to question three. This coefficient is sig- 
nificantly different from zero beyond the .01 
level. 

The first part of question four deals with 
the relationship between scores on the PNLAT 
and the near distance, binocular visual acuity 
subtest of the Ortho-Rater. The obtained r 
of .156 is not significantly different from zero. 
Also, it is remarkably similar to the r reported
by Albright (1, p. 20), who obtained
a nonsignificant correlation of .136 between 
the same two variables. Therefore, it seems 
reasonable to conclude that, notwithstanding 
the apparent visual demands of the PNLAT 
test items, test performance is essentially un- 
influenced by S’s visual acuity. As a matter 
of incidental importance, it may be seen that 
both the Adaptability Test and the criterion 
achievement test correlated positively and sig- 
nificantly with measured visual acuity. Neither 
of these findings is readily explainable nor 
were they anticipated. 

The correlation of -.326 which was found
to exist between PNLAT scores and chronological
age relates to the second part of question
four. This r is significantly different
from zero beyond the .01 level of confidence.
It suggests the likelihood of problems in applying
the PNLAT in situations where marked
age variations exist in the sample of Ss. Perhaps
age norms such as Wechsler (7) and
others have used would be helpful to PNLAT
users.

Although predictive validity coefficients 
greater than that in the present study have 
been frequently obtained, the r of .367 which 
was found to exist between PNLAT and the 
criterion represents a significant and, in some 
cases, useable amount of prediction. Further- 
more, it seems that there are at least two 
reasons for considering the present prediction 
situation as less than an ideal one in which to 
use a non-language test. First, the student 
sample exhibited a relatively high level of 
formal education and such students might be 
expected to display their talents on verbal 
tests without serious disadvantage. Second, 
the criterion to be predicted contains an un- 
questionably large verbal component which is 
not likely to be satisfactorily measured by 
nonverbal tests. It seems reasonable to sup- 
pose that the PNLAT will be of most value 
in situations involving subjects with limited 
knowledge of a language and involving a criterion
which contains no significant verbal
component.


Lindner and Gurvitz report a correlation of
.92 between Wechsler intelligence quotients
and Revised Army Beta scores (5, p. 654).
Such outstanding (!) coefficients are not normally
to be expected from investigations of
construct validity, and it would seem that the
correlation of .433 which was obtained in the
present study satisfactorily establishes the
presence of construct validity in the PNLAT.
Householder reports a median correlation of
.446 between the PNLAT and various verbal
mental ability tests (3, p. 50).

The findings of this study indicate that the 
PNLAT performs about as well as might be 
expected of it. It demonstrated both predic- 
tive and construct validity and absence of 
significant contamination by variations in 
visual acuity. Its apparent discrimination 
against older test Ss does not pose problems 
which cannot be satisfactorily resolved in most 
test usage situations. It would seem that the 
PNLAT merits additional research and use 
and it is likely that, properly applied, the 
PNLAT can be a valuable addition to per- 
sonnel selection and placement operations. 


Received February 10, 1958. 


References 


1. Albright, L. E. The development of a selection
process for an inspection task. Unpublished
doctoral dissertation, Purdue Univer., 1956.
2. Eid[...]cl, R. E. The standardization of a non-language
personnel test for industry. Unpublished
M.S. thesis, Purdue Univer., 1953.
3. Householder, M. F. The development of empirical
indices of validity and reliability of the Purdue
Non-Language Adaptability Test. Unpublished
doctoral dissertation, Purdue Univer., 1957.
4. Johnson, P. O. Statistical methods in research.
New York: Prentice-Hall, 1949.
5. Lindner, R. M., & Gurvitz, M. Restandardization
of Revised Beta Examination. J. appl. Psychol.,
1946, 30, 649-658.
6. Tiffin, J., & Lawshe, C. H. The Adaptability Test:
A fifteen minute mental alertness test for use
in personnel allocation. J. appl. Psychol., 1943,
27, 152-163.
7. Wechsler, D. The measurement of adult intelligence.
(3rd ed.) Baltimore: Williams and
Wilkins, 1944.


Journal of Applied Psychology
Vol. 42, No. 6, 1958


The Additivity of the Times for Human Motor Response 
Elements in a Simulated Industrial Assembly Task 


Elwood S. Buffa and John Lyman 


University of California, Los Angeles 


If accurate predictions of human performance
times for psychomotor tasks can be
made, alternative designs for machines, tools,
and workspaces may be evaluated without
actually building experimental models. The
obvious and substantial economic advantages
to be gained have led to the construction of
a number of standard data time systems by
workers in the field of work measurement (1,
[illegible]). The validity of such systems has been
under constant attack, however, because of
doubts about the basic assumption of additivity
of the motion elements when these elements
are applied in cycles and sequences
different from those in which the data were
gathered originally. The principal basis for
supporting these doubts has come from the
results of a number of experiments by different
investigators which rather unequivocally
proved that interaction exists among the times
of elements, that is, element time is not only
a function of certain major variables such as
distance, class of fit, etc. but also a function
of other elements in the cycle (e.g. 2, 4, 5,
[illegible], 10). In general, the results of the various
investigations have suggested that the interaction
is mainly with the elements immediately
adjacent to the element in question.
Although from the viewpoint of element interaction
or correlation, the concept of additivity
appears to be limited, it would appear
reasonable to ask if the mean cycle time for
a task could be predicted from known standard
element times despite interactions that
might make a given element time inaccurate
in the particular situation. The rationale for
the feasibility of such a possibility is that in
statistics it is well known that a set of random
variables, x, y, and z connected by a
joint density function, f(x, y, z), will yield an
expected value of their sum that is equal to
the sum of their separate expected values,
regardless of interaction between the variables,

E(x + y + z) = E(x) + E(y) + E(z)

(5). If the motion element times in standard
data can properly be assumed to be random
variables it should be true that mean
cycle times can be predicted from a knowl- 
edge of expected (mean) values of the indi- 
vidual elements. The mean total cycle time 
predicted in this manner will, of course, have 
a variance, part of which is accounted for by 
the element interaction or intercorrelation. 
Knowledge of the degree of correlation would 
permit a more precise estimate of the cycle 
time but it remains a question as to whether 
or not the variance due to element correla- 
tions increases total variance to the point 
where estimates of mean cycle times are not 
sufficiently precise to use. 
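
The distinction drawn here, that correlation among element times leaves the mean of the cycle time additive but inflates its variance, can be illustrated with a small simulation. All element names and time values below are hypothetical and chosen only for illustration; a shared "pace" term is one assumed way of making the elements positively correlated.

```python
import random
from statistics import mean, pvariance

random.seed(0)

def simulate_cycles(n_cycles=5000):
    """Hypothetical three-element cycles whose element times share a common pace term."""
    cycles = []
    for _ in range(n_cycles):
        pace = random.gauss(0.0, 0.05)            # shared term induces correlation
        reach = 0.40 + pace + random.gauss(0.0, 0.03)
        grasp = 0.25 + pace + random.gauss(0.0, 0.02)
        move = 0.55 + pace + random.gauss(0.0, 0.04)
        cycles.append((reach, grasp, move))
    return cycles

cycles = simulate_cycles()
columns = list(zip(*cycles))                      # one tuple of times per element

predicted_mean = sum(mean(col) for col in columns)    # sum of element means
observed_mean = mean(sum(c) for c in cycles)          # mean of cycle totals

var_if_independent = sum(pvariance(col) for col in columns)
var_observed = pvariance([sum(c) for c in cycles])

print(round(predicted_mean, 4), round(observed_mean, 4))
print(round(var_if_independent, 4), round(var_observed, 4))
```

The two means agree apart from floating-point rounding, while the observed variance of the cycle totals exceeds the sum of the element variances because of the positive covariances, which is the source of imprecision the paragraph above refers to.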

Two studies, one by Ghiselli and Brown (3, 
p. 369) and one by Stiling (11), have direct 
bearing on the matter. The results of the 
first, which involved two patterns of key tap- 
ping sequences in which some of the keys were 
eliminated from each pattern and the pre- 
dicted and actual cycle times compared, sug- 
gested that additivity does not hold. Appar- 
ently only one S was used, however, and no 
statistical treatment of the data was avail- 
able. In Stiling’s study, 24 Ss performed a 
simple task which involved travel and ma- 
nipulation motions in aligning a pointer to a 
dial scale marking by means of a rotary knob. 
In the first task 5 dials were manipulated. 
In the second, one of the dials was left out of 
the sequence, in the third, two, and in the 
fourth task, three dials were left out. Two 
levels of alignment difficulty were used. The 
basic hypothesis was tested by predicting the 
times for the successive reduced cycles from 
measurements made in the complete cycle 
and comparing these predictions with the 
actual times taken for the reduced cycles. 
The conclusion was that the differences be- 
tween the actual and estimated cycle times 
were explainable as chance variation, that is, 
additivity held for the conditions of the ex- 


380 


Elwood S. Buffa and John Lyman 


DUAL |FEEDER 


PART, A 


=] 


MANUA 


WELL FOR PART A 


O 


ADVANCE 


TON 


o, 
2 


Fic. 1. Workplace layout, 


periment. One might point out that in this 
experiment the motion pattern was not com- 
plex although the task did require some ma- 
nipulation. In addition, direction of motion 
was a confounded variable since the direction 
of the first travel motion changed with each 
successive reduction in the task. 
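
The prediction procedure described for Stiling's study can be sketched as follows, on one plausible reading: the time for a reduced cycle is predicted by summing the mean times of the retained elements as measured in the complete cycle, and the prediction is then compared with the observed reduced-cycle time. All numbers below are hypothetical and are not data from any of the cited studies.

```python
from statistics import mean

# Hypothetical element times (seconds) for three trials of a five-dial cycle.
full_cycle_trials = [
    [0.62, 0.58, 0.65, 0.60, 0.63],
    [0.60, 0.61, 0.66, 0.59, 0.64],
    [0.64, 0.57, 0.63, 0.62, 0.61],
]

def predict_reduced_cycle(trials, omitted):
    """Sum the mean full-cycle times of the elements kept in the reduced cycle."""
    element_means = [mean(column) for column in zip(*trials)]
    return sum(m for i, m in enumerate(element_means) if i not in omitted)

# Four-dial cycle: the fifth dial (index 4) is left out.
predicted_four_dial = predict_reduced_cycle(full_cycle_trials, omitted={4})
observed_four_dial = mean([2.47, 2.44, 2.49])     # hypothetical observed totals
difference = observed_four_dial - predicted_four_dial

print(round(predicted_four_dial, 3), round(difference, 3))
```

Whether such a difference is attributable to chance would then be judged against the variability of the observed cycle times.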

As the results of these two studies were in 
direct conflict, the question of element addi- 
tivity remains open. In addition both studies 
involved simple motion patterns which were 
not representative of the complexity of typi- 
cal industrial tasks. The present study was 
designed, therefore, to attempt to resolve the 
differences and extend the applicability of the 
additivity concept to more complex motion 
patterns which might be expected to influence
the interaction of the motion elements
composing it.


Method 
Design of Experiment 


Figure 1 shows the work place layout which was used to test the
hypothesis of additivity. The task required the S to assemble two parts
in a fixture. Part A was a pin which was an inch and a half long and a
quarter of an inch in diameter, chamfered on one end. Part B was a
bushing, three quarters of an inch in diameter, one-half inch thick, and
with a concentric hole of 0.26 in. The cycle for the right hand is given
below:


1. Reach to Part A
2. Grasp Part A
3. Move to manual advance button, depress button
4. Move Part A to Hole 3 in fixture
5. Position Part A in hole and release
6. Reach to Part B
7. Grasp Part B
8. Move Part B to Hole 4 in fixture
9. Position Part B in hole and release
10. Reach to Part A in Hole 3
11. Grasp Part A and remove from Hole 3
12. Move Part A to Part B in Hole 4
13. Position Part A in Part B
14. Release Part A
15. Reach to disposal button
16. Depress button


Part A was fed to the S from a dual feeder at a rate determined by S.
The S took Part A from a well, pressed a manual advance button and then
moved Part A to the fixture and positioned it in Hole 3. Part B, which
was a bushing, was then procured from its rack by S and moved to the
fixture and positioned in Hole 4. Part A was then removed from Hole 3
and positioned in the center hole of Part B which was in Hole 4. The S
then pressed the disposal button, energizing a solenoid which withdrew a
slide causing Parts A and B to drop into a disposal chute. The table
height was set at 28 in. and an adjustable chair was used to vary the
relative height of the S in relation to the table. In all instances the
chair was adjusted so that the S's elbow was about 3 in. above the table
top with erect sitting posture.


Human Motor Response Elements

The basic hypothesis was tested by measuring times in a complete and an
incomplete cycle. The complete cycle consisted of all of the elements 1
through 16. Elements 9 through 14 were measured separately. The times
for Elements 9 through 14 were subtracted from the total complete cycle
to yield a forecast of the time for an incomplete cycle. Elements 1
through 8 plus 15 and 16 were measured under the conditions of the
incomplete cycle. If additivity held, the mean times measured in this
way should equal the mean times of the cycles where Elements 10 through
13 were not a part of the manual cycle. Element 9 was measured
separately and its time eliminated from both incomplete and complete
cycles because of details in the instrumentation system which made it
overlap in both complete and incomplete cycle conditions. The contrast
between the times for Elements 1 through 8 plus 15 and 16 measured under
complete cycle conditions, with all elements included, and incomplete
cycle conditions, where Elements 10 through 13 were not part of the
task, represents a measure of how closely elements measured in one
sequence can predict the times required to perform the same elements in
a different sequence. The validity of the concept of additivity is thus
tested for the particular task situations examined.
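
The subtraction logic of this test can be sketched in a few lines of
Python. This is only an illustration, not the authors' computation: the
function names are hypothetical, and the three mean times (seconds per
cycle) are the rounded figures quoted in the Results section.

```python
def forecast_incomplete_cycle(complete_cycle_time: float,
                              added_elements_time: float) -> float:
    """Forecast the incomplete-cycle time by subtracting the separately
    measured added elements from the complete-cycle time."""
    return complete_cycle_time - added_elements_time


def forecast_error(forecast: float, actual: float) -> float:
    """Signed gap between forecast and actual time; it should be near
    zero if additivity holds."""
    return forecast - actual


# Rounded means in seconds per cycle, as quoted in the Results section.
complete_cycle = 6.05      # mean complete cycle time
added_elements = 1.41      # mean time of the added elements
actual_incomplete = 4.64   # mean incomplete cycle time

forecast = forecast_incomplete_cycle(complete_cycle, added_elements)
error = forecast_error(forecast, actual_incomplete)
print(f"forecast = {forecast:.2f} sec, gap = {abs(error):.2f} sec")
```

In the study itself the comparison was made by analysis of variance on
log times, not on rounded means, so the sketch shows only the direction
of the reasoning.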
Variables of discrimination and hands used (right hand vs. both hands)
were also introduced. Discrimination was introduced by painting a yellow
band around Part B. In half of the trials the S was instructed to take
only parts with the yellow band; therefore a measure of the effect of
discrimination was provided by a contrast of the times where
discrimination was required compared to those where no discrimination
was required. The hands-used variable compared cycle times where the
task was performed with the right hand only with cycle times where the
task was performed symmetrically with both hands.

The three variables, cycle completeness, discrimination, and hands used,
were combined in a full factorial experiment with two levels for each
factor. The following measurements were made:


1. Total cycle times.
2. Time for the added elements (these times occurred in only the
   complete cycle), Elements 9 through 14.
3. Time for the grasp plus transport loaded of Part B (bushing),
   Elements 7 and 8.
4. Time for the release of Part A (pin) after its insertion in the
   bushing hole (these times occurred in only the complete cycle),
   Element 14.
5. Time for the position plus release of Part B (bushing) in the fixture
   (these times were measured in only the incomplete cycle), Element 9.
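
The full factorial layout described above can be enumerated as a quick
check on the size of the experiment. The sketch below is illustrative
only; the level labels are assumed wordings, and the counts of 16 Ss and
two work samples per treatment are taken from the Subjects and Routine
section.

```python
from itertools import product

# Three two-level factors, as described in the text (labels are assumed).
factors = {
    "cycle": ("complete", "incomplete"),
    "discrimination": ("required", "not required"),
    "hands": ("right hand", "both hands"),
}

# Every combination of levels is one treatment.
treatments = list(product(*factors.values()))

n_subjects = 16   # Ss in the study
n_samples = 2     # work samples per S per treatment

n_observations = n_subjects * len(treatments) * n_samples
total_df = n_observations - 1

for cycle, discrimination, hands in treatments:
    print(f"{cycle:10s} | discrimination {discrimination:12s} | {hands}")
print(len(treatments), "treatments,", n_observations, "observations,",
      total_df, "total degrees of freedom")
```

Eight treatments give the 7 treatment degrees of freedom used in the
analysis of variance, and 256 observations give 255 degrees of freedom
in total.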


Instrumentation

To instrument the work place of Fig. 1 it was necessary to be able to
measure total cycle time in both the complete and the incomplete cycles
and the additional elements required to perform the complete cycle as
compared to the incomplete cycle. Measurements were accumulated for 20
cycles in all instances so that each measurement represented an average.
The total cycle times were accumulated on a precision clock. In order to
measure the additional elements required to perform the complete cycle a
combination of three Eccles-Jordan electronic switches and counters was
required. The electronic switches were triggered by a 100 kc electrical
pulse which was generated by a transmitter under the S's chair and
traveled over the surface of his skin. The electronic switches and
counters had a basic accuracy of .001 sec. All instrumentation was put
into operation by turning a multiple rotary switch. This was done by the
experimenter and was synchronized with the pressing of the disposal
button. All time measurements were made for the right hand.


Subjects and Routine 


Sixteen male university students with preferred right-handedness served
as Ss. Each S was sampled twice for each treatment combination, using an
accumulation of 20 cycles taken after a standard practice period of 50
cycles. A comparison of the first sample with the second provided a
measure of practice effects.

In pretests of the task, it was noted that major changes in the task
severely affected the S's ability to perform effectively. These negative
transfer effects were particularly large for changes from the complete
cycle to the incomplete cycle and vice versa. It was the tendency of Ss
to introduce extraneous elements in the incomplete cycle and to omit
them in complete cycles. This was also true, to a lesser extent, for
changes in the hands-used variable. Because of this the variables were
counterbalanced. Half of the Ss (chosen at random) were presented with
the sequence, complete-incomplete, and the other half with the sequence,
incomplete-complete, on a two-day schedule. Each group was further
subdivided so that half of them were presented with a
right-hand-both-hands sequence, and the other half with the reversed
schedule. Within this structure, the presentation sequence was
randomized.

Results and Discussion 


The over-all means and standard deviations for the basic data are
presented in Table 1.

Preliminary analysis of the raw data for the five measurements by use of
Bartlett's test for homogeneity of variances indicated a rejection of
homogeneity as a hypothesis for all measurements except No. 4, time for
the release of Part A. To obtain homogeneity of variances a logarithmic
transformation of the time scale was used in the detailed analysis of
variance calculations, except in the case of measurement No. 4. P was
set at the .01 level.


Table 1

Means and Standard Deviations for Measured Elements
(Seconds per 20 cycles)

                    Incomplete Cycle                  Complete Cycle
                    Mean            SD                Mean            SD
                    Sample  Sample  Sample  Sample    Sample  Sample  Sample  Sample
Item Measured       1       2       1       2         1       2       1       2

Cycle (a)           4.71    4.59    .99     .89       4.67    4.62    .97     .89
Elements 7 and 8    16.91   16.64   4.59    4.55      16.23   15.83   4.84    [illegible]
Element 14          -       -       -       -         4.65    4.67    1.67    2.09
Element 9           6.66    6.59    1.59    1.69      -       -       -       -
Elements 9-14       -       -       -       -         28.31   28.04   4.93    5.15

(a) Cycle times are in seconds per cycle.

The partitioning of degrees of freedom re- 
quires the separation of subject effects (S), 
treatment effects (T), order effects (O), and 
the interactions of the main effects. O re- 
flects possible differences between Work Sam- 
ples 1 and 2, and as such represents practice 
effects during the experiments. The treatment 
effects can be further partitioned into the 
main effect and interactions of the experi- 
mental variables which are of particular in- 
terest in these experiments. Table 2 shows 
the analysis of variance for cycle times. The 
S × T interaction was highly significant, and
thus it becomes the appropriate error term
for testing the significance of S and of T,
since we are interested in T independent of
S and vice versa.


Table 2

Analysis of Variance, Log Cycle Times

Source              df    Sum of Squares   Mean Square    F

Subjects (S)        15    0.33485          0.02232        9.30*
Treatments (T)       7    1.14546          0.16363        68.10*
Order (O)            1    0.00335          0.00335        6.17
S × T              105    0.25248          0.00240        4.42
S × O               15    0.00746          0.000497       0.92
T × O                7    0.00471          0.000673       1.24
S × T × O          105    0.05700          0.000543

Total              255    1.80531

* p < .01.
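
The F ratios in Table 2 can be re-derived from the tabled mean squares.
The Python sketch below is an illustration only. The text states that S
and T are tested against S × T; the use of S × T × O as the divisor for
the remaining rows is an assumption inferred from the arithmetic, not
something the text states.

```python
# Mean squares as printed in Table 2.
mean_square = {
    "S": 0.02232,
    "T": 0.16363,
    "O": 0.00335,
    "SxT": 0.00240,
    "SxO": 0.000497,
    "TxO": 0.000673,
    "SxTxO": 0.000543,
}

# Divisor for each effect. S and T use S x T (stated in the text);
# the other rows are assumed to use the S x T x O residual.
error_term = {
    "S": "SxT",
    "T": "SxT",
    "O": "SxTxO",
    "SxT": "SxTxO",
    "SxO": "SxTxO",
    "TxO": "SxTxO",
}

f_ratio = {
    effect: mean_square[effect] / mean_square[divisor]
    for effect, divisor in error_term.items()
}

for effect, f in f_ratio.items():
    print(f"{effect:5s} F = {f:6.2f}")
```

The recomputed values agree with the table to within rounding of the
printed mean squares; the ratio for T comes out slightly above the
printed 68.10, presumably because the mean squares are themselves
rounded.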


The F ratios for S and for T are both highly significant. The S is
common in this type of experiment, but it is T independent of S which is
of principal interest here. The latter are displayed in Table 3. In
computing the F ratios in Table 3, the mean square for S × T of 0.00240
with 105 degrees of freedom is the divisor.

The most important implication of the results shown in Table 3 is that
cycle completeness is not significant. In other words, the actual
incomplete cycle times were forecast by measurements made in the
complete cycle and any differences between actual and forecast
incomplete cycle times may be accounted for by chance. This result holds
in the presence of highly significant effects for the discrimination and
hands-used variables and for the discrimination by hands-used
interaction.

A logical question at this point concerns the magnitude of the added
elements, for, if they were minute, it could be argued that forecast
incomplete cycle time and actual incomplete cycle time were equal
because an insignificant element was subtracted from the complete cycle.
An examination of the means shows, however, that the mean of added
elements of 1.41 sec. per cycle was about 30% of the incomplete cycle
time of 4.64 sec. and about 23% of the mean complete cycle time of 6.05
sec. This is hardly an insignificant proportion.
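
The proportions quoted in this paragraph follow directly from the three
means given in the text. A minimal check in Python (variable names are
illustrative):

```python
# Means in seconds per cycle, as quoted in the paragraph above.
added_elements = 1.41
incomplete_cycle = 4.64
complete_cycle = 6.05

share_of_incomplete = added_elements / incomplete_cycle
share_of_complete = added_elements / complete_cycle

print(f"added elements / incomplete cycle = {share_of_incomplete:.1%}")
print(f"added elements / complete cycle   = {share_of_complete:.1%}")
```

The added elements come to roughly three tenths of the incomplete cycle
and a little under a quarter of the complete cycle.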

Table 3

Log Cycle Time, Treatment Effects

Source                    df    Mean Square    F

Cycle Completeness (C)     1    0.000131       0.0546
Discrimination (D)         1    0.1489         62.04*
Hands Used (H)             1    0.968          403.00*
C × D                      1    0.00150        0.625
C × H                      1    0.000581       0.242
D × H                      1    0.0211         8.79*
C × D × H                  1    0.00548        2.28

Total                      7

* p < .01.


Separate analysis of data for Elements 7 and 8 (grasp plus transport
loaded of Part B) indicated that cycle completeness was significant at
p < .01. This means that for grasp plus transport loaded the elements
that preceded or followed in the sequence apparently affected the time.
This is evidence that element interaction existed in the experiment.
This strengthens the conclusion that additivity may exist for mean cycle
times even though element interaction exists. Again, the question of
magnitude arises. Grasp plus transport loaded contains significant
effects due to cycle completeness and yet forecast and actual total
cycle times are equal. Can this be accounted for because the time for
grasp plus transport loaded is so small that its effect is masked in a
long over-all cycle? The mean time for grasp plus transport loaded was
11.8% of the total cycle mean and thus represented a relatively
important element in the cycle. We have concluded, therefore, that
within the limits of this experiment additivity has been shown to be a
valid concept for use in predicting total cycle time for light
manipulatory tasks involving several motion elements. The results imply
that it might be worthwhile to make large population studies of
industrial workers to obtain reliable standard data for making the
predictions necessary for the effective design of new tasks.


Summary 


The purpose of this study was to determine whether or not additivity of
motion elements holds for over-all cycle time predictions despite
interactions among the elements. Precise time measurements were made in
a light manual assembly task requiring 16 motion elements in the
complete cycle and 10 motion elements in the incomplete cycle for 16
male Ss. The results indicated that total incomplete cycle times
predicted from data obtained in the complete cycle did not differ
significantly from times actually measured even though there was
evidence of interactions among the motion elements and the variables of
discrimination and hands-used (one-handed versus two-handed
performance). It was concluded that additivity of motion elements does,
indeed, seem to be a valid concept where several motion elements are
involved.


Received February 14, 1958. 


References 


1. Barnes, R. M. Motion and time study. New York: Wiley, 1949.

2. Barnes, R. M., & Mundel, M. E. A study of hand motion used in small
   assembly work. Univer. Iowa Stud. Engr. Bull., 16, 1939.

3. Ghiselli, E. E., & Brown, G. W. Personnel and industrial psychology.
   New York: McGraw-Hill, 1948.

4. Hall, N. B., Jr. Internal relations of elemental motions within a
   task. J. appl. Psychol., 1956, 40, 91-95.

5. Hecker, D. G., & Smith, K. U. Dimensional analysis of motion: X.
   Experimental evaluation of a time-study problem. J. appl. Psychol.,
   1956, 40, 220-227.

6. Hoel, P. G. Introduction to mathematical statistics. New York:
   Wiley, 1947.

7. Maynard, H. B., Stegemerten, G. J., & Schwab, J. L. Methods-time
   measurement. New York: McGraw-Hill, 1948.

8. Nadler, H., & Denholm, D. H. Therblig relationships: I. Added cycle
   work and context therblig effects. J. industr. Engr, 1955, 6, 344.

9. Smith, K. U., & Harris, S. J. Dimensional analysis of motion: VII.
   Extent and direction of manipulative movements as factors in
   defining motions. J. appl. Psychol., 1954, 38, 126-130.

10. Smith, K. U., & Smathers, R. Dimensional analysis of motion: VI.
    The component movements of assembly motions. J. appl. Psychol.,
    1953, 37, 308-314.

11. Stiling, D. R. A study of the additive properties of motion element
    times. Proc. 5th Annu. Industr. Eng. Inst. Berkeley: Univer.
    California, 1953.


Journal of Applied Psychology
Vol. 42, 1958


Field Training Versus Technical School Training for
Mechanics Maintaining a New Weapon System ¹


Chester J. Judy 


Personnel Laboratory, Wright Air Development Center 


In the United States Air Force and its predecessor organizations a
varying amount of emphasis has been given, from time to time, to
centrally located and organized programs of training for maintenance
personnel. In the Army Air Corps before World War II, for example,
relatively few personnel received training at separate technical
schools. During World War II and for a time thereafter, however, the
vast majority of those who became responsible for day-to-day maintenance
of air vehicles were individuals who had received such training in some
early portion of their military service. Presently, fewer airmen, once
again, are being given complete training at separate technical schools.
Instead, somewhat greater emphasis is being placed upon what is
generally termed “field training.”

In field training, technical instruction covering specified pieces of
Air Force equipment is made available, through mobile training units, at
strategic, defense, tactical, or other bases where “live” equipment is
on hand. One advantage of such instruction over instruction in
separately organized and located technical schools is that airmen can
learn on [...] training is accomplished on [...] time basis [...]
enlarged scale is perhaps especially appropriate [...]


¹ The research reported in this paper was sponsored by the Personnel
Laboratory, Wright Air Development Center, Air Research and Development
Command, under Project No. 7950 [...]

Problem 


A crucial issue, in most instances when a choice of training must be
exercised, is whether or not alternative plans provide equal opportunity
for learning on the part of specified groups of trainees. The present
investigation was accomplished in order to answer two general questions
concerning the training of mechanics for the maintenance of one
important new weapon system:

1. Is there a significant difference, on the whole, in job knowledge of
B-52 aircraft mechanics who have received field training as compared
with those who have completed a more formal technical school course?

2. Is there a significant difference, at particular levels of mechanical
aptitude and maintenance experience, in job knowledge on the part of
airmen who have been exposed to the two kinds of training environment?

In one situation mechanics received maintenance training on the B-52
aircraft in residence at an Air Force technical school. The training was
given over a period of two months on a full-time basis. In the other
situation mechanics received maintenance training on the B-52 aircraft
through mobile training units on duty at operational sites. The duration
of this training was also two months, but the trainees spent one half of
each day on the job. The net gain, from an administrative point of view,
was about 20 man-days for every individual who received field training
rather than technical school training.

Relative to both of the questions asked above, an hypothesis of no
difference was adopted since the two courses presumably covered the same
subject matter and since the total exposure to B-52 maintenance (in the
classroom or on the job) was the same for the two groups.




Procedure 


The statistical design adopted for this study is based upon the use of
the Johnson-Neyman Technique. In computation and plotting operations
actually performed, however, formulas and procedures proposed by Walker
and Lev (4) were used because they permit somewhat easier computation
than those developed by Johnson and Neyman (3). In either procedure the
basis of comparison between categorical groups (in this study the group
comprised of individuals who had received field training versus the
group comprised of individuals who had attended technical school) is
matched regression estimates. Here a measure of job knowledge was used
as the criterion variable and measures or indications of mechanical
aptitude and B-52 aircraft maintenance experience were used as control
variables.

For the purposes of this study job knowledge was defined as performance
on an examination developed by Human Factors, Incorporated (2). In this
examination, which is similar to a number of others which have been
constructed for the Air Force, separate scores are obtainable for
different areas of knowledge in the maintenance of specific pieces of
Air Force equipment. This particular examination is now being used
routinely by the Strategic Air Command to ascertain training needs of
B-52 maintenance personnel.

The Ss of this study were 184 airplane mechanics working at the “5”
skill-level on the B-52 aircraft in November of 1956 at the first two
Air Force bases to be fully equipped with that airplane.² All such
mechanics available for day time duty were tested. They were also given
a short questionnaire covering training and experience items. Mechanical
Aptitude Indexes derived from the Airman Classification Battery (1)
administered during basic training were obtained from personnel records
at the respective bases.


² Among aircraft maintenance personnel in the Air Force a specialty code
of 43131 identifies an apprentice or semi-skilled mechanic, a code of
43151 identifies a skilled mechanic, and a code of 43171 identifies a
maintenance technician at the highest skill level. The Ss of this study
were 43151's and are referred to here as “mechanics working at the ‘5’
skill level” and, at other places in this report, as “mechanics at an
intermediate level of skill.”

Of the 345 mechanics originally tested, 61 
were rejected as potential Ss of this study be- 
cause they had received neither field training 
nor technical school training in the mainte- 
nance of B-52 aircraft. An additional 18 
cases were rejected by reason of incomplete 
data. Airman Classification Battery scores 
were not available for eight potential Ss and 
10 men did not complete some portion of the 
job knowledge test or some portion of the 
questionnaire. Of the remaining 266 me- 
chanics, 174 had received field training and 
92 had received technical school training. 

In the final selection of cases an attempt was made to secure a greater
degree of homogeneity between the field-trained and school-trained
groups by individual-to-individual matching on prior aircraft
maintenance experience. It was felt that previous maintenance experience
might easily contribute to the job knowledge of B-52 aircraft
maintenance personnel, and since there was no statistical control
programmed to take account of this variable, “months on other aircraft”
was used as a basis for matching. As a result of this procedure an
additional 82 cases were lost. In instances when more than one mechanic
could be matched with a particular man to form a pair, a table of random
numbers was used in making the final choice. The 184 Ss selected for
this investigation were comprised, then, of 92 B-52 airplane mechanics
at an intermediate level of skill who had received field training in
B-52 maintenance matched with 92 B-52 airplane mechanics at an
intermediate level of skill who had received technical school training
in B-52 maintenance.
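
The case accounting in this section can be checked with simple
arithmetic. The Python sketch below uses only counts stated in the text;
the variable names are illustrative.

```python
# Counts stated in the Procedure section.
originally_tested = 345
no_b52_training = 61          # neither field nor technical school training
missing_aptitude_scores = 8   # no Airman Classification Battery score
incomplete_test_or_form = 10  # unfinished test or questionnaire

remaining = (originally_tested - no_b52_training
             - missing_aptitude_scores - incomplete_test_or_form)

field_trained = 174
school_trained = 92
lost_in_matching = 82

final_sample = remaining - lost_in_matching
matched_pairs = final_sample // 2

print(remaining, "mechanics before matching")
print(final_sample, "Ss after matching, in", matched_pairs, "pairs")
```

Because matching was one-to-one, the final sample is bounded by the
smaller group: all 92 technical-school-trained mechanics were retained
and the 82 lost cases came from the 174 field-trained mechanics.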


Results 


The basic data for the comparison of the two groups of B-52 mechanics
are presented in Table 1. The following equations, obtained from these
data in the manner outlined by Walker and Lev (4, pp. 406-10), are
plotted in Fig. 1:

8.0671X + .6718Z − 54.0951 = 0    [1]

26.00X² + 10.6528XZ − 2.7413Z² − 417.6X − 8.58Z + 1203 = 0.    [2]

Table 1

Basic Data for Comparison of Two Groups of B-52 Airplane Mechanics on Job Knowledge

                                     Field Trained          Technical School
Datum                                Mechanics              Trained Mechanics

Number of cases                      N₁ = 92                N₂ = 92
Sum, scores on job knowledge test    ΣY₁ = 11,054           ΣY₂ = 11,402
Sum, months B-52 experience          ΣZ₁ = 869              ΣZ₂ = 948
Sum, mechanical aptitude index       ΣX₁ = 524              ΣX₂ = 542
Mean, job-knowledge test score       Ȳ₁ = 120.1522          Ȳ₂ = 123.9348
Mean, months B-52 experience         Z̄₁ = 9.4456            Z̄₂ = 10.3043
Mean, mechanical aptitude index      X̄₁ = 5.6956            X̄₂ = 5.8913
Σ(Y − Ȳ)²                            Cyy₁ = 113,643.869     Cyy₂ = 118,879.6
Σ(Z − Z̄)²                            Czz₁ = 2588.7283       Czz₂ = 2401.4783
Σ(X − X̄)²                            Cxx₁ = 239.4783        Cxx₂ = 176.9131
Σ(Y − Ȳ)(Z − Z̄)                      Cyz₁ = 4949.7609       Cyz₂ = 3760.8261
Σ(Y − Ȳ)(X − X̄)                      Cyx₁ = 2622.2609       Cyx₂ = 661.3479
Σ(Z − Z̄)(X − X̄)                      Czx₁ = −62.5217        Czx₂ = 33.0435


Equation [1] is the equation for the line of nonsignificance. It is
represented by the straight line in Fig. 1, and is the locus of points
at which the estimate of difference in test performance for the two
groups is equal to zero. On the right hand side of the line of
nonsignificance test performance is favorable to field-trained
mechanics. On the left hand side of that line test performance is
favorable to technical-school-trained personnel. Equation [2] has been
plotted in Fig. 1 to show curves which define the limits to a range of
values for X and Z where performance on the job-knowledge test is
significantly different (at the 5% level) for mechanics in the two
groups.

Fig. 1. Comparison of two groups of mechanics on a job-knowledge
examination covering the B-52 airplane. (Axes: X = Mechanical Aptitude
Index; Z = Months B-52 Experience. Plotted: field trained mechanics,
technical school trained mechanics, the center of accuracy (CA), the
line of non-significance, and the two regions of significance at
α = .05, labeled “Field Training Better” and “Technical School Training
Better.”)
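
Equation [1] can be reproduced from the Table 1 sums of squares and
cross products by fitting a two-predictor regression plane to each group
and subtracting one from the other. The Python sketch below is an
illustration under that assumption, not a transcription of the Walker
and Lev worksheet; the function names are hypothetical, and the group
means of Y are computed from the tabled sums.

```python
def regression_plane(mean_y, mean_x, mean_z, cxx, czz, cxz, cyx, cyz):
    """Least-squares plane Y = a + bx*X + bz*Z from deviation sums of
    squares and cross products (two-predictor normal equations)."""
    det = cxx * czz - cxz ** 2
    bx = (cyx * czz - cyz * cxz) / det
    bz = (cyz * cxx - cyx * cxz) / det
    a = mean_y - bx * mean_x - bz * mean_z
    return a, bx, bz


# Values from Table 1 (group 1 = field trained, group 2 = technical school).
field = regression_plane(11054 / 92, 5.6956, 9.4456,
                         239.4783, 2588.7283, -62.5217,
                         2622.2609, 4949.7609)
school = regression_plane(11402 / 92, 5.8913, 10.3043,
                          176.9131, 2401.4783, 33.0435,
                          661.3479, 3760.8261)

# Difference of the two planes: field-trained estimate minus school-trained.
d_const, d_x, d_z = (f - s for f, s in zip(field, school))


def estimated_difference(x, z):
    """Estimated difference in test score at aptitude x and experience z."""
    return d_x * x + d_z * z + d_const


print(f"{d_x:.4f}X + {d_z:.4f}Z + ({d_const:.4f}) = 0")
```

Read this way, the left side of Equation [1] is the estimated score
difference (field trained minus technical school trained), which is
consistent with the statement that points to the right of the line
favor the field-trained mechanics.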

The point at which the variance of the expression
8.0671X + .6718Z − 54.0951 attains its minimum value represents the
center of accuracy (CA) for the plot given in Fig. 1. It is the point
where the observed difference is most reliable, and since the center of
accuracy in this instance lies outside both parts of the region of
significance, the best estimate of the true difference between mean
performance of the groups being compared is zero. That the point is
slightly to the left of the line of nonsignificance is an indication
that there is a slight difference, though nonsignificant, in over-all
test performance in favor of the technical school trained group. By
referring to the basic data in Table 1 it can be seen that mean test
score for the field-trained mechanics was 120.15 while for the
technical-school-trained mechanics this value was 123.93. The similarity
of the two groups in terms of mean aptitude index (5.70 versus 5.89) and
mean experience on the B-52 aircraft (9.4 months versus 10.3 months)
will also be noted.


Discussion 


The results of this investigation seem to justify, to some extent, the
greater emphasis presently being placed on field training for mechanics
responsible for the maintenance of one important new weapon system. If
there is, in fact, no difference in the amount of job knowledge acquired
by personnel who have received maintenance training in either one of two
generally different kinds of training environment, then the choice
between the two schedules can be more easily made on the basis of
administrative convenience, [...], or other considerations. In the
present instance field training is seen to be relatively economical in
terms of manpower and money since the same results, on the whole, are
obtained in less training time.

At particular levels of mechanical aptitude and maintenance experience,
however, the results of this investigation imply that field training
should be reserved for higher aptitude airmen with some maintenance
experience, and that technical school training should be more often
scheduled for lower aptitude airmen who have had little or no actual
work experience.
The last part of this statement may at first 
seem to conflict with the known circumstance 
that a criterion of minimum mechanical apti- 
tude is a useful one on which to screen candi- 
dates for technical school training in the Air 
Force. In Fig. 1, also, the location of the 
left hand part of the region of significance 
might suggest that successful technical school 
training is associated with low aptitude and 
experience. All that is indicated on this 
matter by the results of this investigation, 
however, is that as long as two schedules of 
training such as the ones under immediate con- 
sideration are being retained, and as long as 
personnel at high and low levels of aptitude 
and experience become available for assign- 
ment to either of those schedules, then per- 
haps best over-all training results may be ob- 
tained whenever training assignments are ac- 
complished on a selective basis. 

It should be emphasized that the results of
this investigation pertain to training for the 
maintenance of only one weapon system. 
Similar studies covering other important 
weapon systems must be carried out before 
wider generalizations concerning the relative 
utility of the kind of training schedules con- 
sidered here can be made. 


Summary and Conclusions 


In this investigation a comparison on job knowledge is made between
mechanics who had received field training on an important new weapon
system and mechanics who had received technical school training on the
same system. With the effects of mechanical aptitude and maintenance
experience controlled, these two conclusions seem justified:

1. On the whole, and in the particular kind of situation studied, there
is no significant difference in job knowledge on the part of mechanics
exposed to the two training environments.

2. Mechanics at higher levels of aptitude 
and experience benefit most from field train- 
ing; mechanics at lower levels of aptitude and 
experience benefit most from technical school 
training. 

In connection with the second conclusion, 
ranges of aptitude and experience wherein 
field-trained personnel and technical-school- 
trained personnel differ significantly are speci- 
fied. Implications for present Air Force school 
assignment practices are discussed. 


Received February 25, 1958. 


References 


1. Brokaw, L. D., & Burgess, G. G. Development of Airman Classification
   Battery AC-2A. Lackland Air Force Base, Tex.: USAF Personnel Train.
   Res. Cent., June 1957. (Tech. Rep. AFPTRC-TR-57-1; ASTIA Document
   No. 131422.)

2. Buckner, D. M. Construction of a proficiency examination for
   maintenance personnel on a new weapon system. Lackland AFB, Texas:
   USAF Personnel Train. Res. Cent., August, 1956. (Developm. Rep.
   AFPTRC-TN-56-105; ASTIA Document No. 098880.)

3. Johnson, P. O., & Neyman, J. Linear hypotheses and their application
   to some educational problems. In J. Neyman and E. S. Pearson (Eds.),
   Statistical Research Memoirs, Vol. 1. Cambridge: University Press,
   1936. Pp. 57-93.

4. Walker, H. M., & Lev, J. Statistical inference. New York: Holt,
   1953.



Journal of Applied Psychology
Vol. 42, 1958


The Legibility of “Scotchlite” Versus Other Materials ¹


Anthony Debons 


Rome Air Development Center 


and Clarke W. Crannell 


Miami University 


Reflex-reflective material (“Scotchlite”) is “that material which has
the property of reflecting incident light, from a single source, in a
relatively narrow cone back toward the source” (5). When employed on
road signs and similar displays, this material is designed to provide
greater light reflection over large areas, without the “glare” induced
by metal surfaces of similar reflecting power.

The purpose of the present study is to evaluate the use of
reflex-reflective material to improve legibility of digits viewed from
several different angles and distances at night, primarily with regard
to their potential use as aircraft markings. Cannon (2) has conducted
several tests of this material which indicate that it is sufficiently
durable to be used for aircraft and ground markings. However, if
reflex-reflective material could be shown to provide legibility superior
to that of other display materials and surfaces, the range of practical
applications would extend beyond the immediate purpose of the present
investigation.

Method


Apparatus. This study consists of two ground experiments. These
experiments were conducted in an open field over 1000 ft. in length.

The test displays were made up of AND (Army-Navy Design) digits (see
Fig. 1), 8 × 10 in., placed on [...] placards, 10 × 16 in. (see Fig. 2).
Four figure-ground display configurations were used:

1. Black digits painted on a surface of opaque white paint;
2. Black digits painted on a surface of silver reflex-reflective
   sheeting, #2270;
3. Black digits on a surface of uncoated aluminum;
4. Digits of reflex-reflective material superimposed on a black painted
   surface.*

All numerals from 0 to 9 were presented among the 40 placards used.


¹ This research was supported by the United States Air Force under
Contract AF 33(616)-2544 and monitored by the Wright Air Development
Center, Wright-Patterson Air Force Base, Ohio. The senior author was at
that time Project Scientist at the Wright Air Development Center, and in
charge of the project.

In a given presentation, five placards having 
identical grounds were mounted side by side on a 
support which could be rotated about a vertical axis. 
The center of each placard was 48 in. above the 
ground. The display appeared to the Ss as a series 
of five digits on a plane surface, presented at sev- 
eral angles of obliqueness during the different trials. 
A 5-digit display was chosen because the immediate 
memory span for 5 digits is only slightly less than 
perfect (8). Using all the digits in a single presenta- 
tion could have resulted in errors of memory up to 
80%. 

A green fixation light } in. in diameter was 
mounted on a support and placed just below the 
center placard. Illumination for the display was 
obtained from a standard Air Force spotlight 3 energized
by two series-connected 12-volt batteries kept
at maximum charge through recharging before each 
experimental session. The spotlight was placed at 
the Ss’ position mounted on a rigid support and di- 
rected toward the placards. A reading lamp just be- 
hind the spotlight provided the Ss with the illumi- 
nation needed for recording their responses. 

Procedure. In Experiment I black digits were 
used against reflex-reflective material, white paint, 
or aluminum. Data were collected at viewing dis- 
tances of 144, 218, 330, and 500 ft. and at each dis- 
tance for viewing angles of 90, 60, 40, 27, and 18 
degrees. Viewing angle is the angle made by the line 
of sight and the surface viewed. Thus, when the 
line of sight is normal to the surface, it is being 
viewed at a 90° viewing angle. 
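
The geometry behind these conditions can be sketched numerically. The snippet below is only an illustration, not part of the original method: it assumes that the 8-in. side of the 8 X 10 in. digit is its width, that a distant observer sees a flat surface foreshortened by the sine of the viewing angle, and it uses hypothetical function names. It makes no attempt to reproduce the subtense figures quoted in the Results.

```python
import math

def projected_width(width_in, viewing_angle_deg):
    """Apparent width of a flat target seen from far away.

    viewing_angle_deg is the angle between the line of sight and the
    surface, as defined in the text (90 degrees = normal viewing).
    """
    return width_in * math.sin(math.radians(viewing_angle_deg))

def subtense_arcmin(size_in, distance_ft):
    """Visual angle, in minutes of arc, of a size_in target at distance_ft."""
    distance_in = distance_ft * 12.0
    return math.degrees(2.0 * math.atan(size_in / (2.0 * distance_in))) * 60.0

# Assumption: the 8-in. dimension of the digit is its width.
DIGIT_WIDTH_IN = 8.0

for distance_ft in (144, 218, 330, 500):
    for angle_deg in (90, 60, 40, 27, 18):
        width = projected_width(DIGIT_WIDTH_IN, angle_deg)
        print(distance_ft, angle_deg,
              round(width, 2), round(subtense_arcmin(width, distance_ft), 2))
```

Under these assumptions the most oblique condition (18°) shrinks the apparent width of a digit to roughly a third of its width at normal viewing, which is one reason the oblique angles are the hardest conditions.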

In Experiment II a comparison was made between 
digits made of reflex-reflective material placed against 
a black painted background, and black digits against 
a reflex-reflective background. Only two distances 
were used: 250 and 500 ft. Examination of results
of Experiment I showed that distances of 218 ft. or 
less afforded so many perfect scores that little clear 
discrimination of differences in legibility of mate- 
rials was obtainable. 


The procedure used for collection of data was similar for the two experiments. The Ss were seated in two rows, one behind the other, near the spotlight. The midpoint of S grouping was used as the reference point for distance measurement. Eye level of seated Ss was approximately the same as target height above the ground.

The E provided each S with an answer sheet and read these instructions:

This is a study of the legibility of numerals used to identify aircraft. Over there, about [distance illegible] out, above that little green light will be a panel of 5 digits. When I say “Ready,” look steadily in that direction. I shall turn a spotlight on the panel for about 4 seconds. Your task is to read the 5 digits. After 4 seconds I shall turn the spotlight off, and turn on the reading light on the post behind you. You are to write down the digits you saw in order in the proper boxes on your answer sheet. For example, I shall say, “This is number one. . . . Ready,” and turn on the spotlight. Suppose the five digits you saw were two . . . three . . . four . . . five . . . zero. You would write each of these in its proper little box after number one on your answer sheet. You will have plenty of time to write the five digits in their correct order while the men are changing to a new set of digits.

The sets of 5 digits will be made of different materials, and will be shown, also, from different angles. You will be shown 30 such sets from this distance, and then we shall move to a different distance.

If you are not sure of any digit, you should guess . . . unless you are just unable to make anything out at all. That is, if the digit could be either a three, or an eight, or a five, you should write down which you think it is. But if it could be any digit from one to zero, draw a line through the little box for that one digit.

Also, please do not say them out loud, or compare your results with your neighbor. This is a study of the legibility of the digits, and not a test of your special ability to read digits.

Any questions?

This is number one. . . . Ready. (Expose for 4 seconds. Turn on the reading light for about 10 seconds. During that time say:) Write down the 5 digits in the same order in which they appeared.

2 Specification for black and white paints used is Federal Mil-1-7178.

3 General Utility Lamp, Flash, Signal, Mazda 4501, 5. 3A 26V.

Fig. 1.

Table 1

Relative Luminance for Various Materials and for Various Viewing Angles*

                        Materials Used
Viewing     Reflex-       White       Alumi-      Black
Angle       Reflective    Paint       num         Paint
90°         454.55        8.79        18.94       1.44
85°          40.15        1.00         3.64        .23
60°          31.82         .82          .27        .19
40°          17.73         .62          .20        .17
27°           9.24         .47          .19        .14
18°           3.94         .30          .17        .14

* When viewing is normal to the target surface, the viewing angle is 90°. The white surface at 85° viewing angle was considered as the unit reference to avoid problems related to specular reflection obtained from measurements made at a 90° viewing angle. An 85° viewing angle was not used as an experimental condition.

The digit displays were set for the successive trials by the assistant Es located at the digit support apparatus. The spotlight was turned off during this
period, rendering the digits indiscernible to the Ss.
A red signal light mounted beside the display appa- 
ratus was flashed to inform the E located near the 
Ss that a new series of digits had been set. E then 
exposed the digits by spotlight for 4 sec. After this 
4-sec. exposure, the small reading lamp behind the 
Ss was lighted while they recorded their responses. 
Meanwhile, a new series of digits was set in place 
for the next trial. The procedure was followed 
until 30 exposures had been made at that distance. 
The Ss were then moved to the next distance and 
the procedure was repeated. 

The programing of the digits and the order of 
presentation of the test background materials were 
determined by the use of a table of random num- 
bers with the restriction that each digit must be ex- 
posed for each viewing angle at each distance for 
each material. The angular orientations were pre- 
sented in orderly sequence from 90 to 18 degrees in 
a counterbalanced order. 
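
The restriction on digit assignment can be sketched as follows. This is an illustrative reading, not the authors' actual schedule: it assumes that each material-by-angle cell received all ten digits, split at random into two 5-digit displays, which gives 3 materials x 5 angles x 2 displays = 30 exposures per distance for Experiment I. The counterbalanced ordering of the angles is not modeled, and the function name is hypothetical.

```python
import random

MATERIALS = ("R", "W", "A")        # reflex-reflective, white, aluminum
ANGLES = (90, 60, 40, 27, 18)      # viewing angles in degrees

def build_displays(rng):
    """Return (material, angle, five_digits) tuples for one distance.

    Assumption: within each material/angle cell the ten digits are
    shuffled and split into two displays, so every digit appears once.
    """
    displays = []
    for material in MATERIALS:
        for angle in ANGLES:
            digits = list(range(10))
            rng.shuffle(digits)
            displays.append((material, angle, digits[:5]))
            displays.append((material, angle, digits[5:]))
    return displays

displays = build_displays(random.Random(0))
print(len(displays), "displays per distance")
```

With 13 Ss each reading ten digits per cell, this scheme would yield the 130 digits per angle reported for Experiment I.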

Subjects. Thirteen male college students of approximately 20 years of age were used as Ss. Each S had both near and far visual acuity of 20-20 or better (Ortho-Rater test). Eleven of these 13 Ss used for Experiment I were used for Experiment II.

Table 2

Experiment I. Number of Correct Responses, All Digits Combined
(Number of Ss = 13)
(Number of digits per angle = 130)

                         Number*               Percentage
Distance   Angle      R     W     A         R       W       A
144        90       129   125   128      99.2    96.2    98.5
           60       130   130   125     100.0   100.0    96.2
           40       128   128   128      98.5    98.5    98.5
           27       128   130    99      98.5   100.0    76.2
           18       122   113    64      93.8    86.9    49.2
218        90       128   130   117      98.5   100.0    90.0
           60       126   130   127      96.9   100.0    97.7
           40       126   127    94      96.9    97.7    72.3
           27       117   117    60      90.0    90.0    46.2
           18        76    68    14      58.5    52.3    10.8
330        90       129   109   108      99.2    83.8    83.1
           60       121   123    67      93.1    94.6    51.5
           40       116   107    16      89.2    82.3    12.3
           27        79    34     0      60.8    26.2     0.0
           18        23     7     1      17.7     5.4     0.8
500        90        96    26    34      73.8    20.0    26.2
           60        84    30     0      64.6    23.1     0.0
           40        44     8     0      33.8     6.2     0.0
           27        11     0     0       8.5     0.0     0.0
           18         2     0     0       1.5     0.0     0.0

* The three letters in this line refer to background materials as follows: R, reflex-reflective; W, white; A, aluminum.
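
The percentages in Table 2 follow directly from the counts, since each cell is based on 130 digits (13 Ss, ten digits each). A minimal sketch of that arithmetic, using the reflex-reflective counts at 144 ft. and a hypothetical helper name:

```python
def percent_correct(n_correct, n_presented=130):
    """Per cent of digits read correctly, rounded to one decimal place."""
    return round(100.0 * n_correct / n_presented, 1)

# Reflex-reflective background at 144 ft., counts by viewing angle (Table 2).
counts_144_reflex = {90: 129, 60: 130, 40: 128, 27: 128, 18: 122}

for angle, n_correct in counts_144_reflex.items():
    print(angle, n_correct, percent_correct(n_correct))
```

The same helper with `n_presented=220` applies to Experiment II, where 11 Ss contributed 220 digits per angle.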

Calibrations. Target luminance in the field was 
measured with a Luckiesh-Taylor photometer. The 
field measurements for the white surface at 60° viewing
angle are listed below for each of the viewing
distances: 


Distance 144 218 330 500 
Ft. Lamberts 4 35 i27 409 


Table 1 gives these measurements as luminance ratios 
for each material at each viewing angle. In this 
table, the white surface viewed at 85° to the surface 
is shown as unit luminance. It was felt that a view- 
ing angle of 85° provided the best reference since it 
eliminated the specular reflection which occurred at 
the normal or 90° viewing angle. 


Results 


Table 2 presents the findings obtained from 
Experiment I. At distances of 330 and 500 
ft., where the height of the digits subtends a 
visual angle of 10 sec. and 7 sec. of arc re- 
spectively, identification of the digits on back- 
ground of reflex-reflective material is clearly 


Table 3

Experiment II. Number of Correct Responses, All Digits Combined
(Number of Ss = 11)
(Number of digits per angle = 220)

                        Number            Percentage
Distance   Angle      R a    RD b        R       RD
250        90        214    208        97.3    94.5
           60        220    220       100.0   100.0
           40        219    217        99.5    98.6
           27        190    208        86.4    94.5
           18         99    186        45.0    84.5
500        90        173    198        78.6    90.0
           60        129    193        58.6    87.7
           40         72    165        32.7    75.0
           27         15     87         6.8    39.5
           18          1     17         0.5     7.7

Combined Data. Experiments I and II

Distance   Angle      R                  R
500        90        269               76.9
           60        213               60.9
           40        116               33.1
           27         26                7.4
           18          3                0.9

a R, reflex-reflective background.
b RD, reflex-reflective digits on black background.


Table 4

Tests of Significance (t) of Mean Differences in Legibility of Materials with All Five Angles of Observation Combined

Experiment I

Distance     R-W            R-A            W-A
144          1.02           [illegible]    [illegible]
218          0.08           8.38**         10.12*
330          6.13**         20.09**        10.62**
500          [illegible]    9.48**         2.64*

Experiment II

Distance     RD-R
250          5.68**
500          13.65**

* Indicates t beyond the 5% level of confidence.
** Indicates t beyond the 1% level of confidence.


superior to identification made when digits were placed on aluminum or white painted backgrounds. At 144 ft. and 218 ft., where the height of the digits subtends 24 sec. and 16 sec. of arc respectively, there is no evident superiority of reflex-reflective material over the white painted background. However, a clear superiority of reflex-reflective background over aluminum background is evident in terms of percent correct responses at these shorter distances when the viewing angle is at 27 or 18 degrees.

Table 3 summarizes the findings of Experiment II, and also shows the per cent of correct responses represented in the combined data from Experiments I and II for all viewing angles at the 500-ft. distance. In Experiment II no marked differences in per cent of correct responses occur, at the 250-ft. distance, for discriminations of R (black digits on Scotchlite background) compared with those of RD (Scotchlite digits on black background) except at the 18° viewing angle. At this angle the legibility of the Scotchlite digits (RD) appears to be nearly twice that of black digits (R). At the 500-ft. distance, the superiority of RD (Scotchlite digits on black background) over R (black digits on Scotchlite background) is apparent for all viewing angles. This greater legibility of RD at the 500-ft. distance was least pronounced for the normal or 90° viewing angle.
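
The t ratios in Table 4 come from a t test for correlated measures, in which each S contributes a score under both conditions being compared. The sketch below shows the standard form of that test on made-up scores; the data, the function name, and the choice of the Python standard library are illustrative assumptions and do not reproduce the study's raw data.

```python
import math
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """t ratio and degrees of freedom for correlated (paired) measures."""
    if len(scores_a) != len(scores_b):
        raise ValueError("each S must have a score under both conditions")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)   # sample SD of differences
    return mean(diffs) / standard_error, n - 1

# Hypothetical numbers of correct responses for four Ss under two materials.
material_1 = [10, 12, 9, 11]
material_2 = [8, 9, 9, 8]

t_ratio, df = paired_t(material_1, material_2)
print(round(t_ratio, 2), df)
```

Because the differences are taken within each S, stable individual differences in acuity cancel out of the comparison.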


Table 5

Differential Legibility of Digits at 500 Feet in Terms of Per Cent Correct Responses
(Data for 90° and 40° viewing angles)

Viewing                                       Digits
Angle   Material     1      2      3      4      5      6      7      8      9      0
90°     A         15.4   15.4   30.8   23.1    0.0   15.4   46.2    7.7   15.4   92.3
        W         15.4   30.8    7.7   38.5    7.7   15.4   15.4   15.4    0.0   53.8
        R 1       76.9   69.2   53.8  100.0   61.5   76.9  100.0   61.5   53.8   84.6
        R 2       95.5   77.3   86.4   59.1   86.4   40.9  100.0   86.4   81.8   72.7
        RD        95.5  100.0   90.9  100.0   95.5   95.5   95.5   36.4   95.5  100.0
40°     A          0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0
        W          0.0    0.0    7.7   38.5   15.4    0.0    0.0    0.0    0.0    0.0
        R 1        7.7   38.5   38.5   46.2   46.2    7.7   53.8    0.0   30.8   69.2
        R 2       36.4   31.8   18.2   50.0   13.6   13.6   54.5   27.3    4.5   77.3
        RD        54.5   95.5   95.5   86.4   81.8   90.9   81.8   22.7   63.6   77.3

1 From Experiment I.
2 From Experiment II.

In order to determine whether or not these differences were significant, scores in terms of number of correct responses at all five angles of observation were obtained for each S. The t test of significance for correlated measures was applied to the mean number of correct responses. The obtained t ratios for the differences in the mean number of correct responses for different distances and between materials are presented in Table 4.

The data on the differential legibility of the digits for the 90° and 40° viewing angles at a distance of 500 ft. are presented in Table 5. At this distance, and for the 40° angle of sight, digits made with reflex-reflective sheeting (RD) were found to be superior to those made with black paint, except for the digit 8, which was slightly more legible when presented in black against reflex-reflective background (R), and the digit 0, which was about equally legible under the two conditions (R) and (RD). For the 90° angle, 500-ft. distance, the Scotchlite digits were generally more legible in all cases except for the digit 7, and the digit 8. Digit 8 showed a reversal in favor of black on Scotchlite background.

Discussion

The results of the current experiments indicate that reflex-reflective materials contribute to the legibility of digits viewed at night from various distances and different angles. The extent of the contribution of Scotchlite materials to greater legibility was a function of both the distance of the target and the viewing angle of the S. Digits made of reflex-reflective materials placed on a black background were more readily discriminated, in general, than were digits of black paint on a reflex-reflective background.

Illumination was a pertinent factor in the present study. The data were obtained under nighttime conditions, and the standard Air Force spotlight used for illuminating the digits provided target luminance from approximately .05 mL to 47 mL, depending on the materials used and the distance of the spotlight from the digits. For a constant angular target subtense a decrease in luminance may give rise to a decrease in reading performance. Hence, a reduction in the luminance of a target because of its increased distance from source of illumination may result in a decrease in the readability of the target. Moon and Spencer (6), Shlaer (7) and others have shown that the steep portion of the acuity versus luminance function (the portion where decrease in luminance causes appreciable decrement in performance) is in the range from .01 to 10 mL. The variations in luminances used in this study were primarily in this range. Thus, the decrement in performance may reflect not only the decrease in angular subtense with increase in target distance but also the decrease in target luminance with increase in target distance. To avoid a performance decrement because of low luminance level, the illuminating source should provide for a target luminance of approximately 1 ft. L. If the target is made of reflex-reflective material, the illuminating source need be only approximately 1/40th of that required for flat white paint.
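
The "1/40th" figure can be checked against the relative luminances in Table 1. The sketch below assumes that, for a fixed geometry, target luminance is proportional to the illumination falling on the target, so the ratio of the two luminances gives the factor by which the source could be reduced. That proportionality, the function name, and the decimal points as entered here (restored where the contrast figures in the Discussion imply them) are assumptions of this illustration.

```python
# Relative luminance by viewing angle (degrees), from Table 1.
RELATIVE_LUMINANCE = {
    85: {"reflex": 40.15, "white": 1.00},
    60: {"reflex": 31.82, "white": 0.82},
    40: {"reflex": 17.73, "white": 0.62},
    27: {"reflex": 9.24, "white": 0.47},
    18: {"reflex": 3.94, "white": 0.30},
}

def luminance_gain(angle_deg):
    """How many times brighter reflex-reflective sheeting is than white paint."""
    row = RELATIVE_LUMINANCE[angle_deg]
    return row["reflex"] / row["white"]

for angle_deg in RELATIVE_LUMINANCE:
    gain = luminance_gain(angle_deg)
    # Under the proportionality assumption, 1/gain of the illumination
    # would give the sheeting the same luminance as white paint.
    print(angle_deg, round(gain, 1), round(1.0 / gain, 3))
```

On these numbers the gain is close to 40 near normal viewing and falls off at the most oblique angles, so the 1/40th figure appears to describe the near-normal case.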

Contrast is also an important factor affecting acuity. Cobb and Moss (3) have shown that visual acuity increases as the contrast between the object and its background increases. In the present study the various backgrounds provided different contrasts. For aluminum at 90° incidence and for reflex-reflective materials at all angles of incidence the contrast ratios were approximately 95% or better. For aluminum at the oblique angles, the contrast ratios were in the order of 25%, and for white paint the contrast ratios were in the order of 75% (except at the most oblique angle, at which a contrast ratio of 55% was obtained). The functions reported by Cobb and Moss (3) show that for contrast ratios greater than 75% not much change in acuity results from increases in the ratio. Thus, except for the oblique angles for aluminum and the extreme oblique angle for white paint, contrast would not seem to be a factor affecting performance in this study.
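
The contrast figures quoted above can be approximated from Table 1. The text does not state how contrast ratio was computed; the sketch below assumes the common definition (background luminance minus digit luminance, divided by background luminance, as a percentage), with black paint as the digit. The function name and the decimal placement of the Table 1 values are likewise assumptions of this illustration.

```python
def contrast_percent(background, digit):
    """Contrast of a dark digit on a brighter background, in per cent.

    Assumed definition: (background - digit) / background * 100.
    """
    return 100.0 * (background - digit) / background

# Relative luminances from Table 1: (background, black paint).
white_paint = {60: (0.82, 0.19), 40: (0.62, 0.17), 27: (0.47, 0.14), 18: (0.30, 0.14)}
aluminum = {90: (18.94, 1.44), 60: (0.27, 0.19), 40: (0.20, 0.17), 27: (0.19, 0.14), 18: (0.17, 0.14)}
reflex = {90: (454.55, 1.44), 60: (31.82, 0.19), 40: (17.73, 0.17), 27: (9.24, 0.14), 18: (3.94, 0.14)}

for name, table in (("white", white_paint), ("aluminum", aluminum), ("reflex", reflex)):
    for angle, (background, digit) in table.items():
        print(name, angle, round(contrast_percent(background, digit)))
```

Under this definition white paint comes out in the 70s except at 18°, where it drops to the low 50s, and reflex-reflective sheeting stays above 95% at every angle, in line with the values cited in the text.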


Studies on visual acuity and numeral readability (1) suggest that factors other than the use of reflex-reflective material are important in obtaining the optimum legibility of digits. For example, doubling the size of the digit doubles the distance at which it can be read with equal clarity. However, in everyday usage, the optimal size of digits is limited by the space in which they must be placed. This suggests a need for re-evaluation of the width-to-height ratio of digits used where lateral space is unlimited but vertical space is limited, and vice versa.

It is to be noted from Table 5 that legibility of the digit 8 made of Scotchlite material viewed against a black background was inferior to the same digit made of black paint viewed against a Scotchlite background. Berger (1) found that for white digits on a black background a stroke-width to height ratio of 1 to 13 for the respective digits was optimal while for black digits against a white background a stroke-width to height ratio of 1 to 8 was optimal. The 1 to 8 ratio was used for the digits in the current study. The reversal noted for digit 8 in Table 5 suggests the desirability of investigating the possible improvement of legibility with Scotchlite digits having a stroke-width to height ratio of less than 1 to 8.


Summary

The legibility of Scotchlite vs. other materials was investigated. The target placards carried digits similar to those painted on present-day aircraft. The study was conducted under nighttime conditions with a standard Air Force spotlight used for illumination of digits.

In Experiments I and II male college students, with normal near and distance vision, read sets of 5 digits which were exposed for 4 seconds per set. The digit-bearing placards were presented at viewing angles of 90°, 60°, 40°, 27°, 18°.

In Experiment I all digits were black and were placed against three different backgrounds: (a) reflex-reflective (Scotchlite); (b) white paint; (c) aluminum. The viewing distances were 144, 218, 330, and 500 ft. In this experiment the superiority of the reflex-reflective background was demonstrated at extreme viewing angles for all distances. The superiority of reflex-reflective background was also shown for all viewing angles at the 330- and 500-ft. distances.

In Experiment II the legibility of digits made of reflex-reflective material placed against a black background was compared with that of black digits placed against reflex-reflective background. The Scotchlite digits superimposed on a black background were found to afford superior legibility at extreme angles for the 250-ft. distance and at all angles for the 500-ft. distance. At the 500-ft. distance the greater legibility of the individual Scotchlite digits was demonstrated at the 40° angle for all except the digits 8 and 0.


Received February 27, 1958. 


References

1. Berger, C. Stroke width, form and horizontal spacing of numerals as determinants of the threshold of recognition. J. appl. Psychol., 1944, 28, 336-346.

2. Cannon, J. R. Service tests of “Scotchlite” reflex-reflecting material. WADC Technical Note 54-2, July 1954.

3. Cobb, F. W., & Moss, F. K. The four variables of the visual threshold. J. Franklin Inst., 1927, 205, 831-847.

4. Crook, M. N., & Baxter, F. S. The design of digits. WADC Technical Report 54-262, June 1954.

5. Minnesota Mining & Manufacturing Co. Reflective characteristic of “Scotchlite.” St. Paul: Author, undated manual, circa 1943.

6. Moon, P., & Spencer, D. E. Visual data applied to lighting design. J. opt. Soc. America, 1944, 44, 605-617.

7. Shlaer, S. The relation between visual acuity and illumination. J. gen. Physiol., 1937, 21, 165-188.

8. Woodworth, R. S., & Schlosberg, H. Experimental psychology. (Rev. ed.) New York: Henry Holt, 1954.


Journal of Applied Psychology, 1958


Studies in Management Training Evaluation: I. Scaling 
Responses to Human Relations Training Cases 1


C. H. Lawshe, Robert A. Bolda, and R. L. Brune


Occupational Research Center, Purdue University 


A series of studies has been undertaken in 
the Occupational Research Center to evaluate 
certain techniques popularly employed in hu- 
man relations training. In order to determine 
pre- and posttraining performance levels of 
subject groups, the authors decided upon using 
a standard stimulus device analogous to a 
work sample in human relations. A work
sample was required which would elicit re- 
sponses related to effective handling of social 
interaction situations. 

The stimuli were three commercially avail- 
able sound-slide film cases selected from the 
McGraw-Hill Supervisory Problems in the 
Plant Series. These were: (a) Case of the 
Reddened Eyes, (b) Case of the Reluctant 
Electrician, and (c) Case of Ben’s Problem 
Workers. Each of these cases presents the 
development of a human problem situation 
involving a foreman and one or more em- 
ployees and ends at a point where supervisory 
action is required to relieve the situation, 

Two dimensions of primary interest were 
conceptualized. The first was called Em- 
ployee-Orientation—the extent to which an 
S’s proposed course of supervisory action re- 
flects a cognizance of the human problem in 
the case. The second dimension, Sensitivity, 
was defined as the ability to use the informa- 
tion in the film to explain the employee’s be- 
havior. It is the purpose of this article to 
describe a scaling Procedure by means of 
which open-end responses to stimulus films can 
be reliably scored. 


Procedure 


Several groups of academic and industrial Ss were 
shown the three films and wrote their responses to 
the following questions: 


1. If you were the foreman in this case, what would you do now?

2. Why did the employee behave the way he (she) did?

1 This research is supported by a grant from the Foundation for Research on Human Behavior.


Responses to the first question were scaled on an 
Employee-Orientation continuum; responses to the 
second were rated on Sensitivity.

Sixteen judges who were familiar with the cases 
assigned the responses among nine categories of a 
forced distribution. Because of the time required 
for rating, each response was rated by eight judges. 
The ratings instructions were: 

Orientation Scaling: “. . . A high employee-orientation response is one which reflects cognizance of the human problem described in the film. This cognizance, as reflected in the course of action selected, is what we want you to rate. Low employee-orientation may be evidenced in task-oriented responses, or in answers which tend to avoid the human problem presented.”

Sensitivity Scaling: “. . . scale these responses along a continuum of sensitivity to the employee's feelings. A ‘high’ response would be one which reflects the subject's ability to use subtle social cues presented in the film to explain the employee's behavior. A ‘poor’ response would reflect complete insensitivity, or unwarranted value judgments.”


Analysis 


A summary of the analysis of the rating task is presented in Table 1. The authors conclude that only the responses to the Case of the Reddened Eyes were sufficiently discriminable to justify its use as a research instrument. The correlation between Orientation and Sensitivity scores on this case was .56, which indicates that the dimensions are not independent.
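
The r row of Table 1 reports average rater intercorrelations. A minimal sketch of one way to compute such an index is given below: the Pearson correlation for every pair of judges, averaged. It assumes, for simplicity, that every judge rated every response, whereas in the study each response was rated by eight of the sixteen judges; the ratings and function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_intercorrelation(ratings_by_judge):
    """Mean of the pairwise correlations among all judges."""
    pairs = combinations(ratings_by_judge, 2)
    return mean(pearson(a, b) for a, b in pairs)

# Hypothetical category ratings (1-9) given by three judges to five responses.
ratings = [
    [2, 4, 5, 7, 9],
    [1, 4, 6, 6, 8],
    [3, 3, 5, 8, 9],
]
print(round(average_intercorrelation(ratings), 2))
```

A low average on this index means the judges could not agree on the ordering of responses, which is the sense in which two of the three cases were judged insufficiently discriminable.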


Extension to Scaling Subsequent Responses 


The research outlined above described the 
scaling of responses obtained in several initial 
Subject groups. In order to make use of this 
information in scaling responses obtained in 
subsequent groups, a second scaling study 
Was carried out on responses to the Case of the 
Reddened Eyes. Thirty-seven new responses 
to each question on this case were obtained, 


and it was desired to attach scale values to 
them. 


Master scale. As an initial step in scaling these new items, two master scales were constructed from scaled responses to the questions for the Case of the Reddened Eyes. Ten and 13 “bench-mark” responses were selected from each of the two sets of previously scaled responses. The criterion of acceptability for a “bench-mark” response was arbitrarily set in terms of the range of category scores assigned to the response. Any response on which the range of category judgments was two or less was selected as a “bench-mark”; that is, only those responses were considered on which the eight judges exhibited a high degree of agreement. Displays were constructed, showing a vertical scale graduated in category scores, with the “bench-mark” responses keyed into the vertical scale at appropriate points.

Rating method. Four judges rated the two new sets of responses to the Case of the Reddened Eyes by assigning scale scores according to their judgments of the new response's position on the master scale (i.e., relative to the bench-mark items). The judging instructions were identical to those described previously. Both “What” and “Why” responses were scaled in this manner. Judge agreements on these tasks are shown in Table 2.

Adequacy of ratings. In order to check on the correspondence of scale values obtained in this abbreviated method with those obtained by the forced-sort procedure, 12 responses to each question were randomly picked from the previously scaled responses to the Case of the Reddened Eyes. The four judges slotted these responses into the master scales, and scale scores were obtained by averaging the four new judgments on each response. Rater agreements on these tasks are shown in Table 2 along with the correlations between the new scale scores and those assigned in the forced-sort procedure.

Table 1

Means, Variances and Average Rater Intercorrelations of Response Scale Scores

            Reddened Eyes        Reluctant Electrician       Ben’s Problem Workers
            Orient.    Sens.     Orient.       Sens.         Orient.    Sens.
            Scale      Scale     Scale         Scale         Scale      Scale
Mean        4.99       5.01      4.99          4.99          5.00       4.99
Variance    6.65       5.76      6.40          2.52          4.49       3.38
r           .67        .63       [illegible]   [illegible]   .46        .35
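
The bench-mark criterion and the averaging of the four judges' scores can be sketched in a few lines. The judgments below are hypothetical, as are the function names; using the mean category as a bench-mark's scale value is an assumption of this sketch, since the text specifies only the range criterion.

```python
from statistics import mean

def select_benchmarks(judgments, max_range=2):
    """Keep responses whose category judgments span max_range or less.

    judgments maps a response id to the list of category scores (1-9)
    assigned by its eight judges. Returns {response id: mean category}.
    """
    return {
        response_id: mean(scores)
        for response_id, scores in judgments.items()
        if max(scores) - min(scores) <= max_range
    }

def scale_score(new_judgments):
    """Scale score of a new response: average of the judges' placements."""
    return mean(new_judgments)

# Hypothetical forced-sort judgments by eight judges.
forced_sort = {
    "response_a": [5, 5, 6, 6, 5, 5, 6, 7],   # range 2: qualifies
    "response_b": [2, 5, 8, 4, 6, 3, 7, 5],   # range 6: rejected
}
benchmarks = select_benchmarks(forced_sort)
print(benchmarks)
print(scale_score([6, 7, 6, 5]))   # four judges place a new response
```

Restricting the master scale to high-agreement responses gives the four judges stable reference points, which is what allows a smaller panel to stand in for the full forced sort.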

These data indicate that an adequate scal- 
ing job can be done by a smaller group of 
judges using the master scaling scheme. Both 
interjudge agreements and correlations with 
the initial scale values substantiate this con- 
tention. It was noted that in the abbreviated 
rating approach, the judges tended to build 
a constant error into the judgments; on the 
average, the response scores obtained by the 
“keying-in” method tended to be too high by 
approximately 3 scale category. This distor- 
tion can be eliminated by a transformation of 
the scale and is of no consequence unless com- 
parisons are made between new responses and 
those scaled by the forced-sort procedure. 
Such a comparison is not anticipated. 


Table 2

Correlational Results of Abbreviated Scaling Procedure (Master Scaling Scheme)

Case of the Reddened Eyes

                                                      Orientation    Sensitivity
                                                      Scale          Scale
Correlations between scale values obtained by
  forced-sort and abbreviated scale methods           .99            .95
Average judge intercorrelations on abbreviated
  scaling of “old” items                              .89            .83
Average judge intercorrelations on abbreviated
  scaling of 37 “new” items                           .80            .83


Summary and Conclusions 


In an attempt to develop a training evalua- 
tion device having applicability to several 
levels and types of management groups, stand- 
ard human relations training cases were se- 
lected to serve as work samples of human 
problem situations. These cases describe the 
development of a problem situation and re- 
quire the supervisory S to propose a course of 
action appropriate for the solution of the 
problem. Following presentation of a film, Ss 
were asked to indicate (a) what they would do 
if they were the foreman, and (b) why the 
employee behaved the way he did. Responses 
to the first question were scaled on an Em- 
ployee-Orientation continuum, reflecting the 
extent to which the course of action proposed 
indicated an awareness of the human problem. 
Responses to the second question were scaled 
on Sensitivity, the extent to which the sub- 
ject’s responses reflected the ability to use 
subtle, social cues presented in the film to 
explain the employee's behavior, 


A total of 16 judges participated in scaling these six sets of responses, using a forced-distribution scheme. The average rater intercorrelations for the six scaling tasks prompted the decision to eliminate the cases of the Reluctant Electrician and Ben’s Problem Workers from further research.

A second scaling study was described which 
was directed at scaling new responses to a 
case without duplicating the laborious forced- 
sort procedure. Master scales were con- 
structed of “bench-mark” responses on which 
judge agreement was high, and new responses 
were assigned scores according to their quality 
with reference to the “bench-mark” items. 
Judge agreements on this task were shown to 
be adequately high, and comparisons between 
scale values obtained by the abbreviated pro- 
cedure and those obtained in the forced-sort 
approach indicate that the master scale method 
can be utilized with confidence. 

Further information with respect to “re- 
test” reliabilities and scale validity will be 
presented in subsequent articles. 


Received March 3, 1958. 


Journal of Applied Psychology
Vol. 42


Matching Indices for Use in Forced-Choice 
Scale Construction 


Robert F. Morrison 


Iowa State College 


and Howard Maher 


University of Pennsylvania 


Investigation of the forced-choice method 
has shown a need for basic research and 
an integration of previous studies. In this 
method, statements are paired for equal ap- 
pearance but unequal discrimination. The 
Problem here investigated concerns various 
methods of equating statements for appear- 
ance. If items have several “appearances,” 
the problem of matching may be so great as 
to make forced-choice test construction ex- 
tremely difficult. 

Some work has already been done on this problem. Gordon (4) studied four methods of equating statements for appearance, utilizing both favorable and unfavorable appearing statements, while Edwards and Horst (1) and Highland and Berkshire (5) each used two methods. Wherry 1 utilized both positive and negative items in a study of many appearance variables derived from a search of rating literature.

This study is an attempt to integrate the results of previous studies and to add to them. This is done by analyzing a comprehensive list of appearance scales derived from both a search of the literature and from the insights of people making decisions in a forced-choice test situation. Only positively toned items are used.

The basic question remains: Is more than one appearance index necessary in order to equate forced-choice items?

1 Wherry, R. J. Information on his investigation of forced-choice appearance indices in a personal communication to H. Maher, 1952.

Procedure

The 100 items used in this study were taken by use of a table of random numbers from a total of 336 items developed in an Office of Naval Research project.2 The ONR
project started with an attempt to gain em- 
pirical evidence as to the importance of vari- 
ous personality dimensions. More than 500 
students were asked to name and give exam- 
ples of the five most admired and five most 
disliked personality characteristics. A panel 
of three judges categorized these statements 
into 12 areas representing general personality 
characteristics. Statements typical of each 
area were then used to form 12 50-word de- 
scriptions, one for each characteristic. A 
group of 300 students ranked the 12 descrip- 
tions in order of importance of the charac- 
teristics, as seen in their associates. A cate- 
gory named “Friendliness and Cooperation” 
was ranked as the top one. Consequently 
this was chosen as the category for study. 
The 336 cooperation items were accumulated 
for this category from four sources—the 
original group of statements, a new group of 
200 students who listed examples of “Friend- 
liness and Cooperation,” existing personality 
tests, and the test constructors themselves. 
In the initial step of the present investiga- 
tion a short “test” was constructed to be em- 
ployed in establishing “beating” hypotheses. 
The 100 items were reduced to 40 by further 
use of a table of random numbers. These 
were placed in 10 forced-choice blocks of four 
items each, the matching of items being per- 
formed by a three-man panel utilizing a find- 
ing by Ghiselli (3) in which a forced-choice 
scale was constructed by inspection alone. 
The “test” was administered to 20 fraternity 
men, each being asked to rank the four items 
in each block in the order which would give
him the highest score in applying for fraternity
admission.

² ... opinions of the writers and are not to be construed as official or representing the views of the Navy Department or the ...

Once the testee had completed the test, he 
was asked to state what rationale was used 
in determining his ranking within each block. 
Each choice the interviewee made was fully 
discussed, the entire interview being tape re- 
corded. The interviews were transcribed and 
copies were distributed to a six-man panel of 
staff and graduate students working inde- 
pendently to form “beating” hypotheses. 

From the panel’s discussion, 10 general 
categories were formed which could be rated 
on a scale continuum, had been most fre- 
quently mentioned in the interviews, were be- 
lieved important in rating cooperation and 
friendliness statements, and were thought ap- 
plicable to forced-choice scale construction. 
Some were also backed by information de- 
rived from previous studies by Wherry (see 
Footnote 1) and Gordon (4). Finally five- 
point scales were constructed to represent the 
categories. The 10 scales, more completely 
described by Maher (6), were Group Cen- 
teredness, Basic Value, Restriction, Desir- 
ability, Clarity, Breadth of Coverage, Lead- 
ership, Practicality, Sincerity, and Activity. 
An example of the scaling used is: 

Breadth of Coverage


5 This item is a very general and all-inclusive
one. It covers several, more specific
characteristics of people.

4 This item is a somewhat general item.

3 This item is really neither general nor
specific.

2 This item is a somewhat narrow and specific
one.

1 This item is extremely specific, describing
only a single, narrow characteristic
of a person.


An eleventh scale, Certainty of Observation,
was based on Wherry’s (see Footnote 1)
Factor II, named Certainty of Observation.

Next, the 100 items were arranged into
booklets, being presented, at this stage, not
in forced-choice form but as items to be
judged singly on the five-point scales previously
mentioned. As a pilot study to determine
the homogeneity of fraternity and nonfraternity
samples, the booklets were given
to a group of 11 fraternity and 24 nonfraternity
men with directions for rating each item
on a trial scale (Breadth of Coverage),
picked at random from the other 10. Since
the product-moment coefficient between the 
item means was .76, the two groups were as- 
sumed to be comparable. This finding served 
as the basis for the method of data collection 
used in the next step in which both fraternity 
and nonfraternity subjects were combined. 
Copies of the 11 sets of previously described 
directions and the 100-item booklets were 
randomly distributed, so that each item was 
rated 36 times on each scale. Three hundred 
ninety-six students served as item raters. All 
papers having omissions or dual answers were 
eliminated and the number of raters reduced 
to 33 per scale by either the above elimination
or by randomization. Frequencies were
then obtained for each item on each scale, 
and a mean scale value for each item was 
calculated. Using these means, the intercor- 
relations for the first 11 scales were calcu- 
lated and placed in the correlation matrix. 
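
The scaling step just described (a mean scale value per item, then product-moment intercorrelations between scales computed over the item means) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy ratings are invented here, and nothing below reproduces the study's data or the authors' actual computing procedure.

```python
from math import sqrt

def item_means(ratings):
    """ratings: list of rater rows, each a list of 1-5 ratings (one per item).
    Returns the mean scale value of every item."""
    n_raters = len(ratings)
    return [sum(col) / n_raters for col in zip(*ratings)]

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def intercorrelations(means_by_scale):
    """means_by_scale: dict of scale name -> list of item means.
    Returns one correlation per unordered pair of scales."""
    names = list(means_by_scale)
    return {(a, b): pearson_r(means_by_scale[a], means_by_scale[b])
            for i, a in enumerate(names) for b in names[i + 1:]}

# Hypothetical toy data: 3 raters x 4 items on two of the scales.
clarity = item_means([[5, 4, 2, 1], [4, 4, 3, 1], [5, 3, 2, 2]])
desirability = item_means([[4, 4, 2, 2], [5, 3, 3, 1], [4, 4, 1, 2]])
r = intercorrelations({"Clarity": clarity, "Desirability": desirability})
```

With 11 scales the same routine yields 55 correlations, one for each pair of scales.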
Two additional scales were “borrowed” 
from the above-mentioned ONR project (6). 
In that study 1029 fraternity men had been 
ranked by their fraternity brothers. The in- 
trafraternity reliabilities of this ranking cri- 
terion ranged from .83 to .97 with only four 
below .90. The mean odd-even reliability 
corrected by the Spearman-Brown formula 
was .93 for all fraternities. These men were
then split by fraternities into three groups
equated for reliability coefficients and sample
number. One of these groups was used
to determine item Preference and Discrimination
Indices. The others were retained for
the ONR forced-choice validation and cross-validation
steps. For the Preference scale,
150 raters³ judged each of the 100 items on
how well it described him. The Preference
Index for an item was the mean self-description
for all 150 raters. The Discrimination
Index for an item was the mean Preference
score on the item for the top one third of the
group (as derived from the intrafraternity
rankings) minus the mean Preference score
on the item for the bottom one third of the
group. Intercorrelations were found between
these two scales and each of the previously
described 11 scales to fill out the correlation
matrix to its full 78 intercorrelations. Finally,
the matrix was factor analyzed to yield
five factors.

³ To save time, however, the 100 items had been split into two equal parts of 50 items each. These parts were given out alternately in each fraternity house ...
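
The two borrowed indices can be illustrated with a short sketch. It assumes that self-descriptions are stored as one row per rater and that a criterion rank of 1 means "ranked highest by fraternity brothers"; the function names, the treatment of group sizes not divisible by three, and the toy data are all assumptions made for illustration, not the authors' procedure.

```python
def preference_index(self_ratings):
    """Mean self-description rating per item (rows = raters, columns = items)."""
    n = len(self_ratings)
    return [sum(col) / n for col in zip(*self_ratings)]

def discrimination_index(self_ratings, criterion_rank):
    """Mean Preference score of the top third minus that of the bottom third,
    where thirds are formed from the criterion ranking (1 = best)."""
    order = sorted(range(len(self_ratings)), key=lambda i: criterion_rank[i])
    third = len(order) // 3          # assumption: any remainder stays in the middle group
    top = [self_ratings[i] for i in order[:third]]
    bottom = [self_ratings[i] for i in order[-third:]]
    return [t - b for t, b in zip(preference_index(top), preference_index(bottom))]

def n_intercorrelations(n_scales):
    """Number of distinct pairs among n_scales scales."""
    return n_scales * (n_scales - 1) // 2

# Hypothetical toy data: 6 raters, 2 items, raters already listed in rank order.
ratings = [[5, 3], [4, 3], [3, 3], [3, 2], [2, 3], [1, 3]]
ranks = [1, 2, 3, 4, 5, 6]
pref = preference_index(ratings)
disc = discrimination_index(ratings, ranks)
```

In the toy data the first item separates the top and bottom thirds while the second does not, which is the contrast the forced-choice method relies on. The pair count for 13 scales (11 rated scales plus Preference and Discrimination) gives the 78 intercorrelations reported above.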


Results 


Because of its large number of negative in- 
tercorrelations with the other scales, the Re- 
striction scale was reversed to make for easier 
interpretation. It then became a Non-Re- 
stricted or Universal Behavior scale, and the 
signs of its intercorrelations were changed to 
make nine positive values and only three 
negative values. 

The matrix of scale intercorrelations 
(Table 1) was analyzed using Fruchter’s 
(2) procedure for the centroid method. Two 
criteria, Humphrey’s Rule (2, p. 79) and 
Tucker’s Phi tests (2, p. 77), were used for 
stopping after the extraction of five factors.
The factors, rotated four times, are shown
in Table 2. Rotation was carried to a compromise
between psychological meaning and
simple structure. However, loadings were so
heavy on unrotated Factor I that it was believed
best to maintain it as a general one.
This may also be supported by examination 
of the original matrix, the intercorrelations 
apparently being high enough to support the 
hypothesis of a general factor. Furthermore, 
in the light of Gordon’s (4) findings, the 
concept of a general factor would seem par- 
simonious. 
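
For readers unfamiliar with centroid extraction, the first step can be sketched as below. This is the generic textbook form of the first centroid factor (column sums of the correlation matrix divided by the square root of the grand sum), not a transcription of Fruchter's worked procedure. The communality estimates are supplied by the caller, the sign-reflection step needed when column sums are negative is omitted, and the function name and toy matrix are hypothetical.

```python
from math import sqrt

def first_centroid_factor(r, communalities):
    """Extract the first centroid factor from correlation matrix r.

    r             : square list of lists of correlations
    communalities : values placed in the diagonal before summing (assumed given)
    Returns (loadings, residual matrix after removing the factor).
    Assumes all column sums are positive, so no sign reflection is needed.
    """
    n = len(r)
    m = [[communalities[i] if i == j else r[i][j] for j in range(n)]
         for i in range(n)]
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    loadings = [s / sqrt(total) for s in col_sums]
    residual = [[m[i][j] - loadings[i] * loadings[j] for j in range(n)]
                for i in range(n)]
    return loadings, residual

# Hypothetical one-factor example: correlations generated by loadings .9, .8, .7.
a = [0.9, 0.8, 0.7]
r = [[x * y for y in a] for x in a]
loadings, residual = first_centroid_factor(r, [x * x for x in a])
```

When the matrix really is produced by a single general factor, as in this toy case, the routine recovers the generating loadings and leaves residuals of zero; a matrix dominated by one factor behaves in much the same way on the first extraction.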

The rotated factors found in the study are:

I. Social Desirability: failing to have major
loadings⁴ only on Group Centeredness
and Breadth of Coverage, and containing
59% of the total variance.

II. Universal Behavior: composed of a
high positive loading (.60) on Universal Behavior
and appreciable negative loadings on
Leadership (-.50) and Preference (-.40).

III. Undesirable Activity: having a positive
loading on Activity (.32) and negative
loadings on Leadership (-.30) and Desirability
(-.38).

IV. Breadth of Coverage: containing loadings
only on Breadth of Coverage (.66) and
Certainty of Observation (.47). This is the
only high loading obtained for the former
scale.
V. Nonvalid Preference: containing only
one appreciable loading, i.e., the Preference
loading (.48), but the tendency for a negative
loading on Discrimination (or validity) should
be noted.

⁴ A level of .30 was chosen as the criterion for inclusion in a factor.


Table 1

Correlation Matrix (Upper Half of Table) and Matrix of Residuals After
Extraction of Five Factors (Lower Half)

[Matrix entries are not legible in the source.]

Note: Two-place decimals have been omitted from the table, i.e., results are rounded to two significant figures with the decimal point omitted.


Table 2

Factorᵃ Loadingsᵇ After Rotation

                               I     II    III     IV      V
Group Centeredness           -11    -21     12     15      …
Basic Value                   18    -24    -28     33      …
Universal Behavior            59     60     27     33      …
Desirability                  81    -07    -38     34      …
Clarity                       55     28    -28    -29    -15
Breadth of Coverage           06    -08     17     66     20
Leadership                    43    -50    -30    -02      …
Practicality                  90     14    -09     04    -05
Sincerity                     83    -14     18     08    -03
Activity                      80     17     32     03     02
Certainty of Observation      45     27     18     47     21
Preference                    80    -40     17    -11     48
Discrimination                44    -21     25     03    -27

ᵃ The factors are: Social Desirability (I), Universal Behavior (II), Undesirable Activity (III), Breadth of Coverage (IV), and Nonvalid Preference (V).

ᵇ Two-place decimals have been omitted from the table.


Discussion 


What interpretation can be obtained from
the foregoing results? Factor I (here called
Social Desirability) supports the findings of
Gordon (4) and Edwards and Horst (1), i.e.,
regardless of what we call our scales the person
reacts to the items in terms of general
social desirability or appearance of the items.
The loading on Preference (.80) indicates
that the matching scale used since the inception
of forced-choice scales is generally
satisfactory. The finding is a fortunate one
in terms of economy of scale construction,
i.e., investigators apparently can match on
Preference and have some assurance that
they are thus controlling on general appearance
of the items. Note also that an element
of the Discrimination Index rides along
with the factor. As computed here, and in
many other instances, the Discrimination
Index and the Preference Index are not independent
of each other. However, the loading
is small enough in Factor I so that it is
still possible to pair for appearance and have
discrimination different (as it would not be
possible if r were 1.00, and difficult if r were
appreciably high). Thus general appearance
is not a giveaway to discrimination. With
higher loadings here a testee or ratee might
be able to detect the valid item by spotting
the “better looking” item (where matching
on appearance was not perfect).

Factors II, III, IV, and V are much weaker
than Factor I since their combined variance
is still much less than the variance for that
factor alone. However, they also are of possible
aid in the construction of forced-choice
items. Factor II may indicate that some
items can be so common they are to be considered
undesirable. Factor III has very
limited loadings so its interpretation is not
clear, but Factor IV may indicate that items
describing general behavior would give the
testee or ratee more opportunity to observe
the characteristic in himself. Factor V may
seem to be expressing items with popularity
but with negative validity, “suppressor” items:
items with favorable appearance which, if endorsed,
serve to lower the S’s score.

This study somewhat parallels the work of
Wherry (see Footnote 1). His largest factor,
Positive Emotional Tone, resembles Factor I,
Social Desirability. Failure of this study to
completely verify Wherry’s factors may have
arisen because of procedural differences such
as different item sources, use of positive items
only, and rotation differences. Following
matrix appearance, our first factor, Social
Desirability, was kept high, and rotation was
merely used in “cleaning up” the factors.

Above all, however, the major finding of
this study, i.e., the general factor, would indicate
that forced-choice scale construction
can remain a relatively simple procedure.
Thus items may be matched on a social desirability
index with some assurance that
many other “appearances” will not also require
equating.


Summary 


A study was made of 12 “appearance”
scales for possible use in forced-choice scale
construction. Both interviews and a review
of the literature were sources of the scales.
The interviews were conducted with fraternity
men who were asked what criteria they
had used in attempting to “beat” a forced-choice
test set up supposedly to screen fraternity
applicants. The review of literature
produced two scales, Preference and Certainty
of Observation, while the interview introduced
seven more scales, Group Centeredness,
Activity, Leadership, Breadth of Coverage,
Restriction, Practicality, and Sincerity.
Together the literature and interviews produced
the remaining three scales, Basic Value,
Clarity, and Desirability.

The Discrimination Index was then added
to the 12 appearance scales. From mean
item ratings, product-moment intercorrelations
were calculated, and the correlation
matrix was factor analyzed. Rotation was
minimal and served only to clear up the interpretation
of the five factors obtained. The
finding of a general factor, supported by
previous studies, brings an element of economy
to forced-choice scale construction, tending
to support the pairing of items on only
one appearance index. Because of its high
loading, the most commonly used index,
Preference, seems to be justified in its usage.


Received March 4, 1958. 


References 


1. Edwards, A. L., & Horst, P. Social desirability
as a variable in Q technique studies. Educ.
psychol. Measmt, 1953, 13, 620-625.

2. Fruchter, B. Introduction to factor analysis.
New York: Van Nostrand, 1954.

3. Ghiselli, E. E. The forced-choice technique in
self-description. Personnel Psychol., 1954, 7,
201-208.

4. Gordon, L. V. Some interrelationships among
personality item characteristics. Educ. psychol.
Measmt, 1953, 13, 264-272.

5. Highland, R. W., & Berkshire, J. R. A methodological
study of forced-choice performance
rating. USAF Hum. Resour. Res. Cent. Res.
Bull., 1951, No. 51-9.

6. Maher, H. Construction of a forced-choice “cooperation”
test and investigation of matching
indices of potential value to forced-choice
questionnaires. Ames, Iowa: Iowa State College,
1957.

7. Uhrbrock, R. S. Standardization of 724 rating
scale statements. Personnel Psychol., 1950, 3,
285-316.


Journal of Applied Psychology
Vol. 42, No. 6, 1958


Some Effects of Decision and Discussion on Coalescence,
Change, and Effectiveness¹


D. F. Pennington, Jr.,² Francois Haravey, and Bernard M. Bass


Louisiana State University 


This study examined some differential ef- 
fects of group decision, group discussion, and 
their interaction. Three hypotheses were 
tested: 


1. Group discussion promotes coalescence 
(or increased agreement), effectiveness, and 
change. 

2. Group decision, per se, does likewise. 

3. A combination of discussion and decision 
yields the greatest amount of coalescence, ef- 
fectiveness and change. The absence of both 
discussion and decision produces the least 
coalescence, effectiveness and change. 


The dependent variables were measured by
the correlations between and within a series
of true rank orders and rank order judgments
by members before and after experimental
treatment. These correlations and their
change yielded the measures: coalescence,
stability, and effectiveness.

There were four treatments. Five groups
only discussed the rankings, five groups
reached decisions without discussion, five
groups did both while five others did neither.

Aside from its practical significance, the
problem has some theoretical import. A
theory of leadership proposed elsewhere (2)
assumes that most change in a group faced
with a problem occurs due to interaction; relatively
little need be attributed to isolated
problem solving unless special feedback conditions
occur within the problem itself or the
collection of individuals is irrelevant to need
reduction by the individuals. Participation
or interaction among members should produce
change and the consequences of change:
coalescence and effectiveness. Members without
opportunity to interact should change very
little in behavior.

Secondly, members of a group will be expected
to change in a direction they believe
is more rewarding. If members are attracted
to the group, at least minimally, they will
tend to accept group opinion as more likely
to bring them reward than if they adhere to
their own. But this cannot be done if the
group’s opinion never is stated. Participation,
per se, can give some clue to the group’s opinion,
but a stated decision will provide the
clarity promoting member change.


Earlier Studies


The effects on group behavior of decision-making
and discussion have been reported
earlier, but many previous investigations have
been field studies with less experimental control
possible than in the laboratory. Many of
the laboratory studies did not attempt to
separate the effects of discussion from group
decision-making. For example, as early as
1914, Munsterberg (14) reported that individuals’
judgments of the number of dots on
cards were more correct after participating in
a group discussion. While Burtt (6) also
found that discussion promoted change, he
noted that average effectiveness was not increased.
Radke and Klisurich (15) observed
that mothers of new-born infants who engaged
in discussion among themselves under the
leadership of a dietician, eventually reaching
group decisions coinciding with the
recommended procedures, adopted the desired
behavioral patterns much more effectively
than a control group receiving individual instruction.

Anderson (1) described the presumed forestalling
by discussion of a strike of … members
in a Detroit factory. A meeting was
held of management and union committeemen
during which the offended members were
allowed to discuss their grievances openly.

¹ This study was assisted by Austin W. Flint working under Contract N7ONR 35609, Group Psychology ... The investigators wish to thank Donald J. Lewis for editorial assistance.

² Now at the University of Alabama.



Faced with resistance by pajama factory
workers to changing methods, Coch and
French (7) compared a control group of operators,
taught the usual way, with two experimental
groups of workers. The new
method was accepted more readily in both
experimental situations where the workers
themselves or worker representatives were permitted
to discuss and decide on the changes in
method. Similarly, Levine and Butler (8)
reduced the “halo” in merit ratings assigned
by foremen significantly more than in control
groups by permitting them to discuss and
make decisions regarding more realistic evaluations.

In a study designed to reveal the effects
of discussion, decision, commitment and consensus
on the willingness of students to participate
in experiments, Bennett (5) found:

1. Group discussion was no more effective
than lecture or no influence at all in producing
the desired action.

2. More Ss volunteered from among groups
required to make a decision than from those
who were not.

3. Public commitment was no more effective
in producing the response than was private
commitment.

4. A high degree of group consensus or
agreement regarding the decision to volunteer
was more effective than was a low degree.

Yet, Bennett suggested that many uncontrolled
factors, such as salience of subject
matter and variations in group cohesiveness,
may have been operative in this field situation,
making generalization difficult.

In the same vein, McKeachie (12) contrasted
3 conditions: group discussions followed
by decisions; lectures followed by the
announcement of the results of a secret ballot;
and lecture with votes not announced. But
the effects of discussion in the absence of
decision were not examined. Results indicated
that members shifted opinion in the
direction they perceived the group as a whole
was changing.


Method 


Subjects. Twenty groups of five Ss were recruited
from classes of elementary psychology students at
Louisiana State University. Incentives were the addition
of one point to the course grade of each
volunteer, and monetary awards of $30 for the
group with the highest final accuracy scores and $6
for each of the next five most successful groups.
Volunteers were assigned groups randomly within
scheduling limitations. Groups were assigned treatments
randomly.

Materials and apparatus. Ten sets of names of 
five cities comprised the material to be ranked before 
and after treatment. The “mark-sense-to-electronic
calculator” method described elsewhere (4) was used 
to register the rankings. 

Procedure. For a given problem, Ss under all four 
treatments registered their initial private opinions 
about the rank order of size of population of five 
cities. They registered their opinions again follow- 
ing an intervening period of approximately 3 min. 
but each treatment involved different conditions dur- 
ing the intervening period. There were 10 such 
problems. Each successive problem involved the 
names of five new cities. Five groups of five Ss each 
underwent one of the four intervening conditions: 
discussion-decision; discussion-no decision; decision- 
no discussion; and no discussion-no decision. 

Discussion-decision groups discussed the rank order 
of cities for approximately 3 min. to reach a group 
decision announced by one of the members. Dis- 
cussion-no decision groups discussed the rankings 
without instruction to reach or announce a group 
decision. Decision-no discussion Ss were assigned 
the irrelevant Social Acquiescence Scale (3) for 2 
min., after which a group decision was obtained by 
secret ballot and announced. Ss under “no discus- 
sion-no decision” were assigned the irrelevant Social 
Acquiescence Scale for the full 3 min. between rank- 
ings. The correct rankings never were presented to 
any of the Ss. 


Analysis of Results 


The mark-sense cards on which Ss recorded 
their rankings for all 10 problems, following 
machine conversion and calculation, yielded 
the data on coalescence, effectiveness and 
change for statistical treatment using the group 
as the unit of analysis.³ The analyses of
variance calculated for each of the three de- 
pendent variables of coalescence, effectiveness, 
and change ignored the error variance within 
groups due to differences between trials and 
between Ss within the same group and ex- 
amined scores averaged for all 10 problems. 
The estimates of error were inflated by these 
within-group errors. 

³ ... Baton Rouge refinery on an IBM 650 using an instruction ... described elsewhere (4).


Table 1

Mean Coalescence, Stability, and Effectiveness as a
Function of Discussion and/or Decision

                                            Mean
Measurement                    Discussion   No Discussion   Both
Coalescence    Decision           .38           .23          .30
               No-Decision        .30          -.01          .14
               Both               .34           .11          .22
Stability      Decision           .70           .84          .77
               No-Decision        .76           .92          .84
               Both               .73           .88          .81
Effectiveness  Decision           .06           .06          .06
               No-Decision        .10          -.01          .04
               Both               .08           .03          .05


Coalescence. Within a given group, for a
given problem, coalescence was the difference
in mean rho correlation between members’
initial rankings of five cities and the average
correlation of their rankings of these same
cities finally. The value was positive if members
increased in agreement with each other;
negative if they agreed less finally than initially
on the average problem.
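
The coalescence measure can be illustrated with a small sketch. It assumes tie-free rankings and takes "mean rho" to be the average Spearman rho over all pairs of members; both the pairing rule and the function names are assumptions made for illustration, and the original scoring was done by machine as described elsewhere (4).

```python
from itertools import combinations

def spearman_rho(rank_a, rank_b):
    """Spearman rho between two tie-free rankings of the same n objects."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

def mean_agreement(rankings):
    """Average rho over every pair of members in one group."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)

def coalescence(initial, final):
    """Gain in within-group agreement from initial to final rankings."""
    return mean_agreement(final) - mean_agreement(initial)

# Hypothetical group of three members ranking five cities on one problem.
initial = [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1], [1, 2, 3, 4, 5]]
final = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
gain = coalescence(initial, final)
```

A positive value means the members agree more after the intervening period than before; averaging such values over the 10 problems gives a group score of the kind tabled above.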

The 10 groups permitted discussion exhibited
for all 10 problems a mean coalescence
of .34 while the 10 groups denied discussion
exhibited a significantly lower mean (at the
5 per cent level) of .11 according to a t test.
Similarly, the 10 groups reaching announced
decisions had a mean coalescence of .30, significantly
higher (at the 5 per cent level) than
the mean of .14 attained by groups reaching
no decision. Greatest coalescence (.38) occurred
under a combination of discussion and
decision, while least coalescence (-.01) appeared
when both were absent. However, according
to t tests the mean coalescence of .30
for discussion alone was not significantly
higher than the mean of .23 for decision alone.
Yet, the combination was significantly more
productive than either alone while either alone
yielded significantly greater coalescence than
a complete absence of both. The results are
summarized in Table 1 while the appropriate
analyses of variance are displayed in Table 2.
The standard error of the difference between
means and the error variance was estimated
using the mean squares due to groups within
treatments since the interaction among treatments
was not significant.

Presence or absence of discussion seemed to
exhibit slightly more effect (.34 vs. .11) than
presence or absence of decision (.30 vs. .14).
This may have been due to the fact that in
discussion groups not permitted announced
group decisions, discussion itself gave partial
information about the opinion of the group as
a whole. On the other hand, the groups permitted
decisions but no discussion had to accept
the experimenter’s summary of the secret
ballot as true before allowing their opinions
to be influenced.

Stability. On a single problem involving
ranking five cities, the stability of an S’s
opinion was indexed by the correlation of his
initial and final ranking. As seen in Table 1,
Ss changed their opinions, on the average, to
the same degree as they coalesced in opinion
under the different treatments (the higher the
tabled correlation, the less change in a member’s
decision). Discussion, per se, produced
significantly more change at the 1 per cent
level of confidence. Decision did likewise at
the 5 per cent level. Again a combination
yielded the greatest change (.70) while least
occurred where both decision and discussion
were absent (.92).


Table 2

Source                        Coalescence   Stability   Effectiveness
Discussion-No Discussion        33.3**        25.7**         3.4
Decision-No Decision            26.7**         5.0*           .6
Interactions of Treatments       3.9            .2           3.0
Groups Within Treatments          …             …             …
Total                             …             …             …


Effectiveness. This was the extent the aver- 
age S improved in accuracy. Improvement 
was the gain in correlation of an S’s final 
ranking with the correct ranking of the cities’ 
size (according to the 1950 Census) as com- 
pared with the correlation of his initial rank- 
ing with the correct ranking of cities. While 
the trends in means (Table 1) were in the 
same direction as for coalescence and change, 
the results failed to attain statistical signifi- 
cance using the conservative two-tailed test. 
But this test should be considered as conserva- 
tive since the hypothesis examined was that 
the discussion and decision samples would 
exhibit higher—not merely different—means 
than the control sample. With this in mind, 
the means for the 4 treatments were compared
using one-tailed t tests. Each of the
three experimental discussion and/or decision
means of .06, .10, and .06 was significantly
higher at the 5 per cent level than the mean 
of the control sample (—.01) in which no 
discussion or decision was permitted. Effec- 
tiveness seemed minimized where both decision 
and discussion were absent. 
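
Stability and effectiveness for a single subject on a single problem can be sketched in the same way. The sketch assumes tie-free rankings and uses Spearman rho as the correlation; the function names and the example rankings are hypothetical, and the code is an illustration of the definitions given above rather than the authors' machine procedure.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rho between two tie-free rankings of the same n objects."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

def stability(initial, final):
    """Correlation of an S's initial and final ranking (higher = less change)."""
    return spearman_rho(initial, final)

def effectiveness(initial, final, correct):
    """Gain in accuracy: final agreement with the correct ranking
    minus initial agreement with the correct ranking."""
    return spearman_rho(final, correct) - spearman_rho(initial, correct)

# Hypothetical subject who swaps two cities initially and corrects them finally.
correct = [1, 2, 3, 4, 5]
initial = [2, 1, 3, 4, 5]
final = [1, 2, 3, 4, 5]
s = stability(initial, final)
e = effectiveness(initial, final, correct)
```

A subject who never changes his ranking has a stability of 1.00 and an effectiveness of zero, which is why high stability and low effectiveness tend to appear together in the control condition.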


Summary and Conclusions 


Twenty groups of five subjects per group
were divided randomly into four treatment
categories. Each S twice ranked privately 10
sets of five cities in the order of population
size. Treatment differences were due to differences
in activity intervening between the
initial and final ranking for each of the 10
problems.

Five groups discussed a problem for 3
min., reaching a group decision announced by
one of the members. Five groups discussed
the problem but announced no group decision.
Five groups engaged in an irrelevant task for
two minutes and voted secretly on the true
ranks of the cities, after which the votes were
counted and the group decision announced.
Five groups simply continued the irrelevant
intervening task for the full three minutes and
reranked the cities.


Three measures examined were: (a) coalescence,
or the increase in agreement among
members of a group; (b) effectiveness, or the
difference between initial and final accuracy
of each member; and (c) stability, or the extent
each member did not change his opinion.

The results indicated that: 

1. Coalescence was increased by group dis- 
cussion, group decision and most of all by 
the combination of both treatments. 

2. Change of opinion was significantly 
greater for groups permitted either discussion
or decision, although the effect was much
less pronounced with group decision alone. 
Again, greatest change occurred when both 
were permitted. 

3. One-tailed t tests suggested that effectiveness
was greater under decision and/or
discussion treatments than when neither was 
permitted. 

The present study supports earlier findings 
concerning the efficacy of both group par- 
ticipation and group decision-making. The 
discrepancy with Bennett’s (5) results may 
be a consequence of differences in subject 
matter and criteria. 

The findings are consistent with the assump- 
tion that changes and effectiveness in groups 
primarily result from interaction among mem- 
bers. They also tend to substantiate the 
deduction that clarifying the group decision 
implements the effects of more extended in- 
teraction. 


Received March 10, 1958. 


References 


1. Anderson, K. A Detroit case study in the group
talking technique. Personn. J., 1948, 27, 93-98.

2. Bass, B. M. Outline of a theory of leadership
and group behavior. Louisiana State Univer.,
1955. (Tech. Rep. 1, Contract N7ONR 35609.)

3. Bass, B. M. Development and evaluation of a
scale for measuring social acquiescence. J.
abnorm. soc. Psychol., 1956, 53, 296-299.

4. Bass, B. M., Gaier, E. L., Farese, F. J., & Flint,
A. W. An objective method for studying behavior
in groups. Psychol. Rep., 1957, 3, 265-280.

5. Bennett, E. B. The relationship of group discussion,
decisions, commitment, and consensus
in “group decision.” Hum. Relat., 1955, 8,
251-273.

6. Burtt, H. E. Sex differences in the effect of discussion.
J. exp. Psychol., 1920, 3, 390-395.

7. Coch, L., & French, J. R. P. Overcoming resistance
to change. Hum. Relat., 1948, 1, 512-532.

8. Levine, J., & Butler, J. Lecture vs. group decision
in changing behavior. J. appl. Psychol.,
1952, 36, 29-33.

9. Lewin, K. Group decision and social change. In
T. M. Newcomb & E. L. Hartley (Eds.),
Readings in social psychology. New York:
Holt, 1947, pp. 330-344.

10. Lindquist, E. F. The design and analysis of
experiments in psychology and education. (2nd
ed.) New York: Houghton Mifflin, 1953.

11. Marrow, A. J. Group dynamics in industry; implications
for guidance and personnel workers.
Occup., 1948, 26, 472-476.

12. McKeachie, W. J. Individual conformity to attitudes
of classroom groups. J. abnorm. soc.
Psychol., 1954, 49, 282-289.

13. Miller, N. E. Learnable drives and rewards. In
S. S. Stevens (Ed.), Handbook of experimental
psychology. New York: Wiley, 1951. P. 468.

14. Munsterberg, H. Psychology and social sanity.
Garden City, N. Y.: Doubleday, 1914.

15. Radke, M., & Klisurich, D. Experiments in
changing food habits. J. Amer. dietetics
Assoc., 1947, 23, 403-409.




Journal of Applied Psychology
Vol. 42, No. 6, 1958


The Construction and Analysis of a Leadership Behavior
Rating Form ¹


W. W. Rambo ²


Occupational Research Center, Purdue University 


Recently, four studies have been reported 
which have dealt with the factor analysis of 
a series of behavioral statements in an at- 
tempt to describe the various dimensions of 
leader behavior. In each of the four studies 
the authors report two independent factors 
which describe the relationships that were 
noted among the responses to these behavioral 
statements. One factor, which has been called 
Fairness to Subordinates (4), Social Respon- 
sibility to Subordinates and Society (9), and 
Consideration (2, 3), refers to behaviors 
which are generally considered under the title 
of human relations. The second type of fac- 
tor, which has been called Administrative 
Achievement (4), Executive Achievement 
(9), and Initiating Structure (2, 3), has 
been derived from behaviors which serve to 
define and circumscribe the behaviors of 
others. For each of the investigations a dif- 
ferent group of supervisory positions was 
used, viz. military officers (3), industrial 
foremen (2), school administrators (4), and 
industrial executives (9). 

The first phase of this study represents an 
attempt to examine the possibility of general- 
izing these two leadership dimensions to the 
behaviors observed among individuals holding 
first line and middle management positions 
in a group of midwestern industrial organizations.
In attempting this, relatively simple
rating and item analysis procedures will be
used in order to arrive at a rating form which
will reflect internally consistent and mutually
independent measures of these two dimensions
of leader behavior. Apart from offering a research
tool which will supplement existing instruments,
it is felt that the ability to construct
such a rating form in a new group of
industrial organizations will add supportive
evidence to the findings of the above-mentioned
research.

¹ The author would like to thank C. H. Lawshe
and J. E. Oliver for their assistance in the various
phases of this research.

² Now at Oklahoma State University.

The second phase of this study deals with 
an attempt to describe the leadership behav- 
iors found within several dimensions of the 
formal organizational structure of a large 
manufacturing concern. The relationship be- 
tween leadership dimension ratings and rank- 
ings on over-all supervisory effectiveness will 
also be presented. 


Method 


The names given the two factors by Halpin and 
Winer (3) were adopted in the present investigation, 
and the following definitions were prepared: Con- 
sideration—Behavior, initiated by a person in a po- 
sition of influence, which is motivated by an aware- 
ness of the needs of the subordinates whose actions 
will be affected by this behavior; Initiating Struc- 
ture—Behavior, initiated by a person in a position 
of influence, which serves to define relationships be- 
tween a subordinate and his superiors, his peers, his 
subordinates, the goals of the group, and the mate- 
rials and equipment used. 

The literature was searched for concepts which 
might logically relate to these two dimensions of 
leadership, and the concepts selected were trans- 
lated into behavioral statements. Seventy-two items 
were prepared, samples of which follow: Considera- 
tion—Does he blame people under him for his own 
failures? During the working day is he friendly in 
his contacts with his employees? Initiating Struc- 
ture—Does he let you know that he’s the boss? 
Does he always make sure his people are kept busy? 

The items were placed in random order, and they 
were given to six judges who were asked to assign 
each item to one of four categories, i.e., Considera- 
tion, Initiating Structure, Both, or Neither. They 
were asked to read the definitions of the two dimen- 
sions of leadership and assign each item to one of 
the four categories which best described the logical 
content of the item. Each judge rated the items in- 
dependently. Items were selected for the initial 
form of the instrument if they were placed in the 
consideration category by four judges or in the 
initiating structure category by four judges. 
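
The selection rule described above can be sketched in a few lines. This is only an illustration: the function name and the `min_votes` parameter are hypothetical, and the sketch assumes that "by four judges" means at least four of the six judges.

```python
from collections import Counter

# The four categories offered to each judge.
CATEGORIES = ("Consideration", "Initiating Structure", "Both", "Neither")


def assign_dimension(judgments, min_votes=4):
    """Return the dimension an item is assigned to, or None if it is dropped.

    judgments: one category label per judge (six judges in the study).
    min_votes: assumed threshold; the text says "four judges", read here
               as "at least four".
    """
    counts = Counter(judgments)
    for dimension in ("Consideration", "Initiating Structure"):
        if counts[dimension] >= min_votes:
            return dimension
    return None  # placed in neither dimension by enough judges


if __name__ == "__main__":
    item = ["Consideration"] * 4 + ["Both", "Neither"]
    print(assign_dimension(item))
```

With six judges and a threshold of four, an item can never qualify for both dimensions at once, so the order of the loop does not affect the result.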

The ratings resulted in the elimination of 18 items 
from the original list of 72. Of the 54 remaining 
items, 26 were placed in the consideration dimension 
and 28 were placed in the initiating structure dimension.
The items were placed in a rating form
which used dichotomous, Yes-No, response categories.

The preliminary form was administered to 151
management men who represented companies in and 
around Lafayette, Indiana. Seventy-eight of the Ss 
held first-line management positions, 51 held second- 
line management positions, and 22 held positions at 
or above third-line management. For the purposes 
of analysis, the last two groups were combined. Us- 
ing the rating form, the Ss were asked to describe 
the behavior of their immediate superior. The Ss 
were instructed not to include their name or the 
name of their superior on the rating form. 

Arbitrary scoring weights of one and zero were 
used in the evaluation of the ratings, with a weight 
of one being given to responses indicative of a high 
degree of consideration or structuring. Each rating
form, therefore, yielded two scores, one for each
dimension.

For each management group a separate internal 
consistency item analysis was performed on each dimension (6).
The Lawshe-Baker nomograph (7)
was used to derive the item statistic, Omega, and
this statistic was transformed into t values. Items
were retained if they yielded t values which were
significant at the .05 level of significance on both
internal consistency analyses.


The proposed instrument demands not only a group
of homogeneous items for each dimension, but that
the items in one dimension be independent of the
items in the other. Therefore, the same item analysis
procedures were carried out, but this time the items
in each dimension were examined in order to determine
whether they could discriminate between the
criterion groups of the second dimension. Hence,
each item was again subjected to two more analyses,
one for each management level. Items were retained
if they yielded at least one t value which was
not significant at the .05 level. This selection criterion
was felt to be stringent enough for this initial
“screening” analysis since the items would again be
subjected to a later analysis.

The items that were selected from the above series
of analyses were placed into another form of
the instrument. This form was administered to 197
management men who held positions in the manufacturing
division of a large automobile manufacturing
organization. One hundred thirty-two of the Ss
held positions of foreman, 47 were general foremen,
and 18 were assistant superintendents. Two forms
from the foreman group were incomplete so they
were eliminated. A series of behavioral descriptions
were obtained in each of nine units of the manufacturing
division of this organization. These descriptions
extended upwards from the foremen to the superintendents.
Hence, using the rating form, foremen
described the leadership behaviors of their
immediate superiors, the general foremen. The general
foremen, in turn, described the leadership behavior
of their superiors, the assistant superintendents, and
the assistant superintendents described the leadership
behavior of the superintendents.


These descriptions reflected the formal organizational
structure of the company since they referred
to the formally defined leader. The perceived or informal
leader did not enter into the responses to the
rating form except to the extent that the formal and
informal structures coincided. As a result of this
procedure, each member of the supervisory force at or
above the rank of General Foreman was described
by all of his immediate subordinates. Since the number
of foremen within the several units was rather
large, a representative sample of foremen was selected
to participate in the study. The rank and file
employees were not included in the data; therefore,
there were no descriptions available for the foremen.
All rating forms could be identified according to department
and supervisory level with the exception of
the behavioral descriptions that referred to the superintendent
group. Here, at the request of the Personnel
Department, these forms could only be identified
according to supervisory level.

In order to obtain the perceived effectiveness
with which each S carried out his managerial functions,
a rating procedure was initiated which reversed
the direction of the previously described procedure.
Therefore, superintendents were ranked on over-all
supervisory effectiveness by the works manager and
his assistant; all of the assistant superintendents
within a given unit were ranked by the superintendent
of the unit; and all of the general foremen in
each unit were ranked by the superintendent of the
unit. These ranks did not extend across departmental
boundaries. In other words, the superior in
a given department ranked only those individuals
within that department who held lower positions in
the supervisory hierarchy; he did not rank individuals
in other departments. This resulted in nine
sets of ranks, one for each department.

After the sets of ratings were completed, the forms
were grouped according to department, and primary
and hold-out groups were determined by randomly
selecting departments until the N for the hold-out
group approximated the N for the primary group.
Ninety-seven Ss were included in the primary group.

A second item analysis was performed on the items
in the rating form; however, there was no attempt
to carry out separate analyses for each management
level. Therefore, following basically the same procedure
that was previously employed, the items were
examined for internal consistency. Items were retained
that exhibited significant t values.

Next, each item was examined for independence
from the second dimension. Items which did not
display significant discriminability were retained.
Upon completion of the analysis each item had been
subjected to six separate item analyses, three for
internal consistency and three for independence.


Results


The first series of item analyses resulted in the
loss of 14 of the 54 items which were included
in the first form of the instrument.


The final series of item analyses yielded an 
instrument which was composed of 29 items, 
12 items measuring behavior described by the 
initiating structure dimension and 17 items 
yielding a measure of the consideration dimension.³

The forms in the hold-out group were 
scored using the scoring key which emerged 
from the final series of item analyses, and a 
split-half reliability coefficient was computed 
for each dimension. The assignment of items 
to the halves was determined randomly. A 
coefficient of .73 was obtained for the con- 
sideration dimension. Stepping up this co- 
efficient by the Spearman-Brown technique 
yielded an estimated coefficient of .84 for a 
form twice the length of one of the halves. 

The split-half reliability estimate obtained
from the items in the initiating structure dimension
was .79. Stepping up this value resulted
in an estimate of .88.
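
The stepped-up values reported above can be checked with the standard Spearman-Brown formula, r' = kr / (1 + (k - 1)r), with k = 2 for a form twice the length of one half. The sketch below is illustrative only; the function name and the `length_factor` parameter are not taken from the article.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Estimate the reliability of a form `length_factor` times as long.

    r_half: reliability coefficient of the shorter form (here, the
            correlation between the two random halves).
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


if __name__ == "__main__":
    # Split-half coefficients reported in the text.
    print(round(spearman_brown(0.73), 2))  # consideration: 0.84
    print(round(spearman_brown(0.79), 2))  # initiating structure: 0.88
```

Both results agree with the coefficients of .84 and .88 given in the text.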

A coefficient of interrater agreement (5)
was computed for each dimension in order to
estimate the consistency with which raters
describe the behavior of a given ratee. This
statistic permits an estimation of the reliability
of an instrument in situations in which
there is an unequal number of observations
recorded for the various individuals on whom
the instrument is applied. Coefficients of .73
and .57 were obtained for the consideration
and structure dimensions, respectively.

In order to arrive at an estimate of the de- 
gree of relationship existing between the two 
dimensions, a Pearson r was computed from 
the dimension total scores. A correlation of 
.02 was obtained. This r was not significant 
at the .05 level of significance. 

Since the foregoing analysis was concerned
with the internal characteristics of the instrument,
the second phase of the study was carried
out on the combined primary and hold-out
groups.

One of the purposes of this study was to
determine whether or not there are differences
in the leadership behaviors that are observed
along several dimensions of the formal organizational
structure. A two-way classification
analysis of variance was performed in
order to arrive at some estimate of these formal
dimension effects. The horizontal and
vertical axes of the company’s organization
chart formed the two axes of the analysis of
variance. However, there was one limitation
imposed on this analysis because it was impossible
to identify the department supervised
by each superintendent. This impelled
the exclusion of this group from the present
analysis. Tables 1 and 2 present the results
of the analysis of the scores obtained from
the two dimensions.

³ A 4-page table giving items and item statistics
from the final series of item analyses has been deposited
with the American Documentation Institute.
Order Document No. 5770, from ADI Auxiliary Publications
Project, Photoduplication Service, Library
of Congress, Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies.
Make checks payable to Chief, Photoduplication
Service, Library of Congress.


Table 1

Analysis of Variance for Consideration Dimension
Scores Classified with Respect to the Horizontal
and Vertical Organizational Axes

Source          df    Mean Square    F
Levels           1        1.22      <1.00
Departments      8       20.38       2.14*
D × L            8       10.38       1.09
Error          159        9.52
Total          176

* F.05 = 1.98 (8, 159 df).

Examination of Table 1 will demonstrate 
that the F for supervisory levels did not at- 
tain significance at the .05 level. However, 
the test for differences between department 
consideration scores did result in an F value 
which was significant at the .05 level. This 
indicates that as one moves across depart- 
ments, the behaviors which are described by 
the consideration dimension demonstrate sig- 
nificant fluctuations. Therefore, it may be 
said that this aspect of supervision is not con- 
sistent among the several horizontal units of 
the formal organizational structure. 
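
The F values in Table 1 follow from dividing each mean square by the error mean square. The short sketch below recomputes the two 8-df effects and compares them with the tabled critical value of 1.98; the function and variable names are illustrative and do not come from the article.

```python
# Values copied from Table 1 (consideration dimension).
MS_ERROR = 9.52          # error mean square, 159 df
F_CRITICAL_05 = 1.98     # critical F at the .05 level for (8, 159) df

# Effects tested on 8 df, so the same critical value applies to both.
EFFECTS = {"Departments": 20.38, "D x L": 10.38}


def f_ratio(ms_effect, ms_error):
    """F ratio for an effect tested against the error mean square."""
    return ms_effect / ms_error


def summarize(effects, ms_error, f_critical):
    """Return {effect: (F rounded to 2 places, significant?)}."""
    rows = {}
    for name, ms in effects.items():
        f = f_ratio(ms, ms_error)
        rows[name] = (round(f, 2), f > f_critical)
    return rows


if __name__ == "__main__":
    for name, (f, significant) in summarize(EFFECTS, MS_ERROR, F_CRITICAL_05).items():
        print(f"{name}: F = {f}, significant at .05: {significant}")
```

The Levels effect is omitted because it is tested on 1 df and would need a different critical value; its mean square (1.22) is smaller than the error mean square, so its F is below 1.00 in any case.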

Table 1 also indicates that the F value for
interaction between departments and supervisory
level did not reach the value required
for the .05 level. Therefore, the analysis was
not able to demonstrate a differential effect
of departmental factors as a function of the
particular level of supervision observed. In
terms of group averages, it appears that members
within any one department were functioning
under the same degree of considerateness.


Table 2

Analysis of Variance for Structure Dimension Scores
Classified with Respect to the Horizontal
and Vertical Organizational Axes

Source          df    Mean Square    F
Levels           1        1.26      <1.00
Departments      8        7.70       2.24*
D × L            8        5.74       1.69
Error          159        3.38
Total          176

* F.05 = 1.98 (8, 159 df).


Table 3

Analysis of Variance for Consideration Dimension
Scores Classified with Respect to Leadership
Patterns and Rank Category

Source           df    Mean Square    F
Pattern           3        8.54      <1.00
Rank Category     2       13.04       1.17
P × R             6        8.20      <1.00
Error           118       11.13
Total           129


The results of a similar analysis on the 
structuring dimension scores are presented in 
Table 2. Here it can be seen that the results 
parallel those reported for the preceding
analysis of the consideration dimension.

Since it was desirable to determine whether 
there was a significant difference between the 
mean dimension scores obtained from the su- 
perintendent level and the two levels which 
have just been analyzed, and since this last 
analysis did not indicate a significant differ- 
ence between the two levels, mean scores for 
both dimensions were computed from the pool 
of these two levels. A t test was performed
in order to determine the significance of the
difference between these pooled means and
the mean dimension scores of the superintendent
level. For both levels the t’s were
not significant at the .05 level.

For each dimension a mean score was computed
from the several descriptions obtained
of the leadership behavior of each assistant
superintendent, and these means were arranged
according to order of magnitude. The
two resulting continua were employed to form
the two axes of a scattergram. Each axis was
dichotomized at the median, thus yielding a
four-cell table which reflected four patterns
of leadership behavior. Under each pattern
were placed the average scores obtained on a
dimension by the general foremen who served
under these four patterns of leadership. The
four groups of general foremen were further
classified according to their rankings on over-all
supervisory effectiveness. Three classifications
were used. In departments which
had six or more general foremen, the top and
bottom two ranks were assigned to the “good”
and “poor” classifications, and the remaining
Ss were placed in the “average” classification.
For departments which had fewer than six
general foremen the top and bottom extreme
ranks were assigned to the good and poor categories,
and the remaining ranks were assigned
to the middle category.
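
The two classification steps described above can be sketched as follows. The sketch rests on assumptions that the text does not state: rank 1 is taken as the most effective, a score strictly above the median counts as "Hi" (the handling of ties at the median is not reported), and all function names are hypothetical.

```python
from statistics import median


def classify_ranks(n_foremen):
    """Label ranks 1..n as "good", "average", or "poor".

    Assumes rank 1 is the most effective general foreman and n >= 2.
    Departments with six or more general foremen use the top and bottom
    two ranks; smaller departments use only the extreme ranks.
    """
    k = 2 if n_foremen >= 6 else 1
    labels = []
    for rank in range(1, n_foremen + 1):
        if rank <= k:
            labels.append("good")
        elif rank > n_foremen - k:
            labels.append("poor")
        else:
            labels.append("average")
    return labels


def leadership_patterns(consideration, structure):
    """Median-split both dimensions and return (structure, consideration)
    levels for each assistant superintendent.

    consideration, structure: dicts mapping a leader id to a mean score.
    Assumption: scores strictly above the median are "Hi".
    """
    c_med = median(consideration.values())
    s_med = median(structure.values())
    return {
        leader: ("Hi" if structure[leader] > s_med else "Lo",
                 "Hi" if consideration[leader] > c_med else "Lo")
        for leader in consideration
    }


if __name__ == "__main__":
    print(classify_ranks(6))
    print(classify_ranks(4))
```

Each general foreman would then be placed in one of the twelve cells formed by his superior's pattern and his own rank category, which is the layout analyzed in Tables 3 and 4.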

For both dimensions, a two-way analysis of
variance was performed for an R × C table
with disproportionate subclass numbers (10).
In the event of a significant F ratio for patterns
on either dimension, orthogonal comparisons
were planned.

Table 3 presents the results of the analysis
of variance of the consideration dimension.
Here it can be seen that the F values for all
main effects and interactions were not significant
at the .05 level. Table 4 presents the
results of a similar analysis of the structuring
dimension scores. Here it can be seen
that the F values for rank categories and interaction
did not attain significance, but the
value for patterns was significant at the .01
level of significance. This led to the analysis
of the two meaningful orthogonal comparisons.


Table 4

Analysis of Variance for Structure Dimension Scores
Classified with Respect to Leadership
Patterns and Rank Category

Source           df    Mean Square    F
Pattern           3       15.67       4.62*
Rank Category     2        5.47       1.61
P × R             6        6.20       1.81
Error           118        3.39
Total           129

* F.01 = 3.95 (3, 118 df).


Table 5

Orthogonal Comparisons for Structure
Dimension Analysis

Contrast                                 df    Mean Square    F
Hi Structure vs. Lo Structure             1       27.99       8.26*
Hi Structure-Hi Consideration vs.
  Hi Structure-Lo Consideration           1        2.94      <1.00
Error                                   118        3.39
Total                                   120

* F.01 = 6.86 (1, 118 df).



Table 5 presents the results of these comparisons.
It can be seen that the F for the
comparison of the general foremen serving under
the high structure leadership patterns with
the general foremen serving under
a “low” degree of structure was significant at
the .01 level. This result indicates that assistant
superintendents who were high in
structure tended to have general foremen
serving under them who were also high in
structuring behavior. Assistant superintendents
who were low in structure had general
foremen serving under them who were also
low in the amount of structure they afforded
their subordinates.

The second comparison presented in
Table 5 indicates that the degree to which
assistant superintendents who are high in
structure manifest behavior that is described
by the upper or lower segments of the consideration
continuum does not influence the
descriptions of the structuring behavior which
were obtained from the ratings of the general
foremen group.


Discussion 


The construction of the rating form was
based on the assumption that the importance
of two dimensions of leadership could be generalized
beyond the situations in which the
defining factor analyses were performed. The
judges’ ratings of the initial list of items
yielded two groups of items which were 
judged to be compatible with one of the defi- 
nitions of the two dimensions. The item 
analysis techniques further “purified” these 
two groups by yielding two homogeneous 
groups of items which were not significantly 
related to each other. 

It is felt that the results of the first phase 
of this study offer support to the four previ- 
ously mentioned studies that used factor 
analysis techniques to describe leader behav- 
ior. Factor analysis is primarily a descrip- 
tive device which does not permit the estab- 
lishment of confidence intervals for rotated 
factor loadings. Therefore, studies which 
demonstrate the emergence or utility of a 
particular factor definition in a new situa- 
tion will lend evidence which will support at- 
tempts to generalize this dimension beyond 
the situation in which the original factor 
analysis was performed. 

Likert and Katz (8) have reported experi- 
mental results which tend to agree with a 
two dimensional concept of industrial leader- 
ship behavior. These authors have identified 
two types of supervisory behavior, behavior 
that is essentially “employee centered” and 
behavior that is “production centered.” These 
two classifications of behavior were obtained 
from a series of survey and interview pro- 
cedures, and an examination of the behaviors 
which have been related to these two super- 
visory “types” will reveal a close similarity 
to the two dimensions of leadership which 
are the topic of this investigation. It seems, 
therefore, that there is considerable evidence 
in the literature which tends to support a 
two-dimensional conception of leadership be- 
havior. 

The analysis of the data which was ob- 
tained from the several supervisory levels, the 
vertical axis of the formal organization, indi- 
cates that there were no significant leader be- 
havior differences existing between these lev- 
els. Since this axis refers to the authority 
hierarchy found in the company, it may be 
said that these two dimensions of leadership
do not significantly vary with varying degrees
of authority found at the three levels observed.
The vertical axis also reflects a scale of increasing
responsibility within the organization.
The responsibilities which have been
formally defined for any one particular level 
may be thought of as a generalization of the 
responsibilities found at the lower levels of 
this axis. It is a generalization of the more 
or less specific responsibilities found at these 
lower levels. The results of this analysis in- 
dicate that increasing scope of formal re- 
sponsibility is not accompanied by significant 
changes in these two dimensions of leader 
behavior. 

The analysis performed on the horizontal 
or departmental axis indicated significant de- 
partmental differences in the scores on the 
two dimensions. Hence, as one moves from 
one department to the next he moves across 
different leadership behaviors. It must be re- 
membered that this analysis was performed 
using only two levels of supervision; the in- 
terpretation of these findings is made within 
this limitation. 

The significance of the above findings 
should be interpreted in the light of the non- 
significant interaction between departments 
and supervisory levels. This nonsignificant
interaction term indicates that the depart- 
mental differences observed were not depend- 
ent upon the particular level of supervision 
that was involved in a given comparison. 
Therefore, even though significant departmental
differences exist, these differences were
rather consistent for the two levels within the
departments. This intradepartmental consistency
might well be explained by the leadership
climate concept that has been posited by
the Ohio State investigators (1, 2). Within 
each department there seems to exist a type 
of supervisory leadership that is found at 
each level of supervision. At first glance it 
would seem that the superior-subordinate re- 
lationships establish a “climate” in which 
rather consistent or similar forms of leader 
behavior emerge. This might be a function 
of the dependent relationship which exists between
the industrial supervisor and his subordinates.
However, the analysis of the leader
behavior found under the four different patterns
of leadership does not completely bear
out this explanation, at least within the frame- 
work of the formal organizational structure. 
It will be recalled that on only the structuring
dimension was there some evidence
of superior-subordinate behavior similarities.
Here it was found that assistant superintendents
who fell into the high structure group
tended to have subordinates who were also
high on this dimension score. The Ohio State
studies found this relationship on both dimensions.

The most apparent explanation for the discrepancies
between these two studies is the
fact that the Ohio State study used the informal
leadership relationships while this study was
concerned with the formal structure. This
lack of agreement between the two studies
might offer some insight into the nature of
the factors which result in this climate. Basically,
the behaviors that compose the initiating
structure dimension relate to a given superior’s
expectations. That is, they relate to
what he expects of his subordinates. Hence,
due to the dependent relationship existing between
the two, in order for these expectations
to be fulfilled the subordinate must reflect
this structure down to his own subordinates.
For example, if the superior requires that his
subordinates report to him concerning the
progress of the work, the subordinate, in order
to comply with this requirement, must
expect the same from his subordinates. Since
the dependent relationship existing between
the two is primarily defined by the formal organization
chart (it is the formally defined
superior who generally evaluates a subordinate
and plays an important role in determining
pay raises and promotions), it would be
expected that these behavioral similarities
will be reflected in the formal superior-subordinate
relationships. However, unless the
superior actually inspects or audits subordinate
supervisory behavior with respect to
considerateness, this leadership similarity
would not be expected in the formal structure.
The present results seem to indicate
that considerate behavior is not a function of
the formal superior-subordinate interactions.

The nonsignificant F values obtained from
the analysis of the two dimensions of leader
behavior with respect to the three classifica- 
tions of supervisory effectiveness present some 
information which concerns the evaluation of 
the general foremen group. The ranks can 
be thought of as a representation of the 
evaluational perceptions of the superintend- 
ent group. The scores obtained by the gen- 
eral foremen on the two dimensions represent 
a description of the behavior that is directed 
at their subordinates, the foremen. The re- 
sults indicate that the estimates of managerial 
effectiveness are not related to supervisory 
behavior as described by those actually supervised.
Hence it is probable that the interaction
of the superior and his subordinates
plays a more significant role in this evalua- 
tion than the relation between the subordi- 
nate and his own subordinates. 


Summary 


A study has been reported the first phase 
of which deals with the construction of a rating
form which purportedly reflects two dimensions
of leadership behavior. These two
dimensions, called Consideration and Initiating
Structure, were derived from research reported
by Halpin and Winer (3), Hobson
(4), Rupe (9), and Fleischman et al. (2).
Item analysis procedures were employed in 
an attempt to obtain a series of behavioral 
statements which were internally consistent 
within a given dimension and yet were inde- 
pendent of the statements in the second di- 
mension. A rating procedure which required 
a “logical analysis” of the statement content 
was employed to aid in the interpretation of 
the items which survived the item analysis. 

Stepped-up split-half reliability coefficients 
of .84 and .88 were obtained for the considera- 
tion and initiating structure dimensions, re- 
spectively. The two dimensions intercorre- 
lated .02 which was not significant at the .05 
level of significance. 




The second phase of the study deals with 
the analysis of the scores obtained from the 
above instrument in relation to the formal 
organizational structure of a large manufac- 
turing concern. Significant behavioral varia- 
tions were observed along the horizontal axis 
of the company, but not up the vertical axis. 
Some evidence is presented which supports 
the leadership climate concept. 

The results indicate that there is no rela- 
tionship existing between scores on the two 
dimensions and rankings of over-all super- 
visory effectiveness. 


Received March 14, 1958. 


References 


1. Fleischman, E. A. Leadership climate, human 
relation training and supervisory behavior. 
Personn. Psychol., 1953, 6, 205-222. 

2. Fleischman, E. A., Harris, E. F., & Burtt, H. E. 
Leadership and industrial supervision. Ohio 
State Univer., 1955. (Personn. Res. Bd. 
Monogr., No. 3.) 

3. Halpin, A. W., & Winer, B. J. Studies in aircrew
composition III: The leadership behavior
of airplane commanders. Tech. Rep. 3,
Human Research Laboratories. (Mimeo.)

4. Hobson, R. L. Some psychological dimensions 
of academic administrators. Unpublished doc- 
toral dissertation, Purdue Univer., 1948. 

5. Horst, P. A. A generalized expression for the 
reliability of measures. Psychometrika, 1949, 
14, 21-23. 

6. Kelley, T. L. The selection of upper and lower 
groups for validation of test items. J. educ. 
Psychol., 1939, 30, 17-24. 

7. Lawshe, C. H., & Baker, P. C. Three aids in the
evaluation of the significance of the difference
between percentages. Educ. psychol. Measmt,
1950, 10, 263-270.

8. Likert, R., & Katz, D. Supervisory practices and 
organizational structures as they affect pro- 
ductivity and morale. Amer. Mgmt Ass. Per- 
sonn. Ser., 1948, No. 120. 

9. Rupe, J. C. Some psychological dimensions of 
industrial executives. Unpublished doctoral 
dissertation, Purdue Univer., 1950. 

10. Snedecor, G. W. Statistical methods. (5th ed.)
Ames, Iowa: Iowa Univer. Press.


Journal of Applied Psychology
Vol. 42, No. 6, 1958


Interdependence of Successive Absolute Judgments 


Warren W. Willingham 1


U.S. Naval School of Aviation Medicine 


The simple numerical scale is one of the 
most widely used rating methods. The ob- 
server judges each stimulus independently and 
assigns a numerical scale value to it. Conse- 
quently, this method falls into the broad psy- 
chophysical category of absolute judgment. 
Ideally we would prefer that the individual 
judgments be truly independent. However, 
we are well aware that the judgments will be 
determined to a great extent by the over-all 
stimulus context, and a true absolute scale 
does not exist. On the other hand, when we 
are dealing with a single group or class of 
stimuli all being judged in the same context, 
we are not concerned with this type of relativity
and we assume that for our purposes,
the judgments are independent. It will be
noted that this assumption implies that there
are no successive response biases. It implies
that the order of stimulus presentation is of
no consequence.

The results of a recent study by Garner
(1) cast considerable doubt on this assumption.
Garner’s observers made successive absolute
judgments on the loudness level of
tones. Among other things it was found that
a middle range stimulus would tend to be
judged loud if the preceding stimulus was
loud and weak if the preceding stimulus was
weak. Two possible explanations for this
bias are as follows. First, it may be due to
a simple response bias similar to number
guessing habits. Secondly, the bias may be
a sensory phenomenon in that the observer
may be responding to a summation of the
immediate stimulus and the aftereffect of the
preceding stimulus. If this latter explanation
is correct, we would expect the bias only in
judgments which are predominantly sensory
in nature. If, on the other hand, the biases are
response habits, we might find such biases in
any judgment, be it of fact, of value, or
whatnot. It is these more cognitive judgments
which typify most rating situations.
It was the purpose of this experiment to
determine whether successive absolute judgments
of a cognitive nature are interdependent,
and, if so, to evaluate a method for
controlling this bias.


1 Opinions expressed here are those of the author.
They are not to be construed as necessarily reflecting
the views or endorsement of the Navy Department.


Method


The rating task which was selected involved rating
the populations of countries. At its simplest the
design required that two independent groups of Ss
rate the population of some “test” country; one
group having just rated a sparsely populated country,
the other group having just rated a populous
country. For example, one group rates Canada after
Panama and another group rates Canada after China.
If an immediately preceding rating biases a subsequent
rating, this bias should be reflected in divergent
mean ratings of Canada by the two groups.
This type of design was incorporated several times
in longer lists of words as illustrated in Fig. 1. List
A was administered to one group of Ss and List B
to another group. Test Item a, Canada, was in
fourth position on both lists. Siam was Test Item b,
Burma was Test Item c, and so forth. There were
eight such test items incorporated in the complete
list of 26 countries.

Six groups, each of about 65 Naval Aviation
Cadets, served as Ss. The total N was 387. Three
groups rated List A using a 5, 9, and 20-point scale,


Sequential
position    List A                  List B

    1       Holland                 Holland
    2       Greece                  Greece
    3       Panama                  China
    4       Canada (test item a)    Canada
    5       Rumania                 Rumania
    6       China                   Panama
    7       Siam (test item b)      Siam
    8       Argentina               Argentina
    9       Iceland                 India
   10       Burma (test item c)     Burma
    .       .                       .
   26       Brazil                  Brazil

Fig. 1. Experimental lists employed (each with
a 5, 9, and 20-point scale).




Table 1


Differences Between the Mean Ratings of the Test Items
as Rated on Lists A and B (Mean Rating Following
a Populous Country Minus Mean Rating Following
a Sparsely Populated Country)


                                                Revised
Test     5-Point     9-Point       20-Point      20-Point
Item     Scale       Scale         Scale         Scale

a        -0.1        -0.0           0.6          -0.7
b         0.0         0.6           0.9          -0.6
c         0.1         0.6           2.3**         0.1
d         0.1         0.7*          1.3           0.2
e         0.1        [illegible]    2.0*          0.6
f        -0.1         0.4          [illegible]   [illegible]
g        -0.0        -0.1          -1.5*         -1.0
h         0.2         0.4          [illegible]    0.3
Aver.     0.03        0.35**        1.08*         0.01

 * p = .05.
** p = .01.


respectively. Three groups rated List B using a 5,
9, and 20-point scale, respectively. The idea of rating
the population of each country on an n-point
scale was briefly explained. Then the Ss were simply
told to go down the list and rate each country
according to where it should stand in relation to all
countries in the world.
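The paired-list design can be sketched in a few lines of Python. The sketch below uses only the ten positions printed in Fig. 1; the function name `find_test_items` and the rule it applies (same country in the same position on both lists, with a different country immediately before it) are illustrative assumptions, not a procedure stated by the author.

```python
# First ten positions of each list, copied from Fig. 1.
LIST_A = ["Holland", "Greece", "Panama", "Canada", "Rumania",
          "China", "Siam", "Argentina", "Iceland", "Burma"]
LIST_B = ["Holland", "Greece", "China", "Canada", "Rumania",
          "Panama", "Siam", "Argentina", "India", "Burma"]


def find_test_items(list_a, list_b):
    """Return (position, item, predecessor_in_a, predecessor_in_b) tuples.

    Assumption for illustration: a test item is a country that holds the
    same position in both lists while the country just before it differs.
    """
    found = []
    for i in range(1, min(len(list_a), len(list_b))):
        same_item = list_a[i] == list_b[i]
        different_predecessor = list_a[i - 1] != list_b[i - 1]
        if same_item and different_predecessor:
            # Positions are reported 1-based, as in Fig. 1.
            found.append((i + 1, list_a[i], list_a[i - 1], list_b[i - 1]))
    return found


TEST_ITEMS = find_test_items(LIST_A, LIST_B)
for position, item, before_a, before_b in TEST_ITEMS:
    print(f"position {position}: {item} "
          f"(after {before_a} in List A, after {before_b} in List B)")
```

Applied to the excerpt in Fig. 1, the rule picks out Canada, Siam, and Burma, which are the three test items the figure labels a, b, and c.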


Results 


The results obtained for the eight test items 
are shown in the first three columns of Table 1. 
The entries in this table are differences be- 
tween the mean ratings of the paired test 
items in Lists A and B. Thus, the first entry
in Table 1 (-0.1) is the difference obtained
by subtracting the mean rating for
“Canada following Panama” (List A) from 
the mean rating for “Canada following China” 
(List B). The mean rating which followed a 
sparsely populated country was always sub- 
tracted from the mean rating which followed 
a populous country. The first column of 
Table 1 indicates that the five-point scale 
showed no bias effect. In the case of the nine-point
scale the differences are generally positive,
which indicates a tendency to rate the
test item in the direction of the previous rating.
Using a two-tailed t test with seven degrees
of freedom, the over-all effect is associated
with the .01 level of significance. The
20-point scale shows an even larger effect in
the same direction. These results agree with
those of Garner in showing a shift of the sub- 
jective scale away from the previous rating. 
The data are also in agreement with those of 
Garner in showing an increasing effect as the 
number of response categories increases. 
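As a rough illustration of the over-all test, the sketch below runs a one-sample t test on the eight item differences for the five-point scale, the one column of Table 1 that is fully legible here. It assumes the author's test treated the eight differences as the observations (which is what seven degrees of freedom suggests); the function name is hypothetical. Because the table entries are rounded to one decimal, the computed mean only approximates the printed average of 0.03.

```python
import math
import statistics

# Mean rating following a populous country minus mean rating following a
# sparsely populated country, test items a-h, 5-point scale (Table 1).
FIVE_POINT_DIFFS = [-0.1, 0.0, 0.1, 0.1, 0.1, -0.1, -0.0, 0.2]

# Two-tailed critical values of t for 7 degrees of freedom.
T_CRIT_05 = 2.365
T_CRIT_01 = 3.499


def one_sample_t(diffs):
    """Return (mean, t, df) for a test of the mean difference against zero."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    # Standard error of the mean, using the sample (n - 1) standard deviation.
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean, mean / standard_error, n - 1


MEAN_DIFF, T_VALUE, DF = one_sample_t(FIVE_POINT_DIFFS)
print(f"mean difference = {MEAN_DIFF:.4f}, t({DF}) = {T_VALUE:.2f}")
if abs(T_VALUE) > T_CRIT_05:
    print("significant at the .05 level")
else:
    print("not significant at the .05 level")
```

With these rounded entries t comes out at about 1.0, well short of the 2.365 needed at the .05 level, which is in line with the statement that the five-point scale showed no bias effect.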

Two additional groups of Ss were tested 
to determine whether revised directions could 
mitigate this bias. Groups of 62 and 66 Ss 
rated the countries with Lists A and B re- 
spectively using a 20-point scale. The experi- 
mental conditions remained constant except 
for one additional instruction. After the
regular instructions the experimenter commented
that the Ss could probably do a better
job if they would rate the high ones first,
the low ones next, and then the ones in the 
middle. The fourth column of Table 1 shows 
the results under this condition. None of the 
individual items showed a significant differ- 
ence between the two forms, and the over-all 
effect is close to zero. 


Discussion 


As we have previously mentioned, the re- 
sults of this study agree closely with results 
obtained by Garner. That writer had sug- 
gested that the effect might be due to either 
a judgmental bias or a sensory phenomenon. 
Since the judgment involved in this study is 
essentially a question of factual knowledge, 
it is very doubtful that the bias is sensory in 
nature. 

On the surface it would seem that this bias 
is closely related to the anchoring effect in the 
framework of adaptation level theory. How- 
ever, this similarity is more superficial than 
real. It will be remembered that the effect of 
introducing an anchor stimulus before each 
stimulus to be judged is to extend the psy- 
chological scale toward the anchor stimulus. 
That is, numerical ratings tend to be smaller 
after a large anchor and larger after a small 
anchor. Thus, the effect is exactly opposite 
to the effect we have obtained. The same is 
true if we consider the over-all context effect.
A stimulus will be judged small in a context
of large stimuli and large in a context of small
stimuli. It appears likely that the bias discussed
here operates independently of the
contextual effects handled by adaptation level
theory.


Summary 


One of several different forms of a rating
sheet was administered to 515 Ss. The results
indicated that ratings tend to be biased
in the direction of the previous rating, and
that the bias increases as the number of
response categories increases. No bias effect
was found when the Ss were instructed to
rate the extreme stimuli first.


Received March 17, 1958. 


Reference 


1. Garner, W. R. An informational analysis of ab- 
solute judgments of loudness. J. exp. Psy- 
chol., 1953, 46, 373-380. 


Identification of Cola Beverages: V. A Visual Check 1


G. Y. Kenyon and N. H. Pronko


University of Wichita


The present study grew naturally out of
four earlier investigations (1, 2, 3, 4) on the
identification of cola beverages by taste. In
all of these studies, Coca Cola, Pepsi Cola,
and R. C. Cola were used. In one experiment,
a fourth, relatively unknown cola
brand was added. The over-all results in
the series showed that the same responses
were “emitted” whether three different beverages
or four different ones were given,
whether there were three of the same brand
or four of the same brand, whether they were
actually the leading brands or practically unknown
ones, and whether Ss were told they
would sample Coca Cola, Pepsi Cola, and R. C.
Cola and were actually given these beverages,
were not told what the cola drinks were and
were given the three leading brands, or were
told that they would be given the leading
brands and were given some unknown cola
or colas.

Diverse conditions, then, elicited a tedious
sameness of response and the 645 Ss observed
to date appeared to respond obsessively with
the easily triggered phrase, “Coca Cola, Pepsi
Cola, R. C. Cola.”

We, therefore, speculated about the possible
pattern experimental Ss might give if
they were shown visual cola stimuli presented
tachistoscopically. What would happen if
the various cola bottles, bottle caps, and
brand names were shown at fast exposures?
Would there be any relationship between the
visual discriminations of a group of Ss and
the gustatory discriminations of the Ss of
our earlier studies? Answering these questions
permits a test of the hypothesis previously
suggested in Studies II and III regarding the
probable effectiveness of advertising and the
prevalence of cola stimuli in the cultural media,
with the conjecture that these factors make for a
readiness-to-respond with a “signal reaction”
incorporating the three “leading” brand
names. These hypotheses would be supported
if the visual dice are loaded the same
way as the gustatory ones were.


1 Grateful acknowledgment is made to Margaret
Habein, Dean of Liberal Arts, for her generous support
of this study, to James Rutherford for his help
in collecting the data, and to J. F. McGovern for
facilitating the tabulation and processing of the data
onto IBM cards.


Procedure 


The Ss of the present study consisted of 
210 students (6 groups of 35 each) from the 
Elementary Psychology courses. After they 
were seated in a Visual Aids screening room, 
they were each given a Data Sheet and were 
asked to fill it out, giving name, sex, and other 
information that was thought to be relevant. 
The room was then darkened but not enough 
to prevent further recording of responses on 
the sheet and the following instructions were 
then read: 

“A series of slides will be flashed on the 
screen at the front of the room. They will 
appear and disappear very quickly. We 
would like your cooperation in trying to see 
what appears there for a fraction of a second 
and in recording what you see. You will 
have just time enough to record your re- 
sponse and to get ready for the next slide 
which will be flashed immediately after you 
hear a ‘Ready’ signal. Naturally, we are 
interested only in your own reaction. This 
is not a test of intelligence and there is no 
right or wrong answer. Whatever you see is 
correct, so please record only what you see 
without regard to what you might see others 
write. Please do not talk to anyone about 
this study either during the experiment or 
afterwards.” 

Prepared 35 mm. color slides were then 
projected on a screen for 1/400 sec. at ap- 
proximately 15-second intervals. There was 
a total of 45 slides in three stimulus cate- 
gories, 15 of them with color reproductions 
of bottles, 15 with color reproductions of 
bottle caps, and 15 were black and white 
slides containing typewritten brand names, 
Coca Cola, Pepsi Cola, and R. C. Cola (sic).


Table 1


Identification Responses of 210 Ss to Slide Presentations of Cola Bottles,
Bottle Caps, and Typed Brand Names Showing Number of Times a Brand Name
Was Identified Correctly or Misidentified as Some Other Brand


[Table body not legible in this copy. Rows: bottles, bottle caps, typed
brand names, and all stimuli, each for singles, pairs, and both. Columns,
for each brand of cola shown (Coca Cola, Pepsi Cola, R. C. Cola): correct,
incorrect (each of the other two brands), misc., and no response.]


Bottle, cap, and brand name slides were each
subdivided as follows: Each stimulus category
had three of the stimuli presented singly
while six of them were shown paired in
counterbalanced order and six of them were
reproduced as triples in counterbalanced order
so as to control for possible position effects.
The bottles and caps were presented side by
side while the typewritten brand names were
presented in vertical pairs or triples. Presentation
of bottles, caps, and brand names
was also given in a counterbalanced order to
the six groups so that each stimulus category
appeared in a first, second, or third order the
same number of times. However, the position
of each slide within the category was
the same for all groups. It took about 35
minutes to run a group of Ss.
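The slide counts in the Procedure can be checked with a short enumeration. The sketch below assumes that "counterbalanced order" means every ordered pair and every ordered triple of the three brands was used once (which yields the six pairs and six triples reported), and that the six groups received the six possible orders of the three stimulus categories. Both readings are assumptions made for illustration, since the text does not list the actual slides or group orders.

```python
from itertools import permutations

BRANDS = ("Coca Cola", "Pepsi Cola", "R. C. Cola")
CATEGORIES = ("bottles", "bottle caps", "typed brand names")

# Slides within one stimulus category.
singles = [(brand,) for brand in BRANDS]           # 3 single slides
pairs = list(permutations(BRANDS, 2))              # 6 ordered pairs
triples = list(permutations(BRANDS, 3))            # 6 ordered triples
slides_per_category = singles + pairs + triples    # 15 slides

# One presentation order of the three categories per group of Ss.
group_orders = list(permutations(CATEGORIES))      # 6 orders

# How often each category falls in the first, second, and third position.
position_counts = {
    category: [sum(order[pos] == category for order in group_orders)
               for pos in range(len(CATEGORIES))]
    for category in CATEGORIES
}

print(len(slides_per_category) * len(CATEGORIES), "slides in all")
for category, counts in position_counts.items():
    print(category, counts)
```

Under these assumptions the enumeration gives 15 slides per category and 45 in all, and each category occupies each serial position in two of the six group orders.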


Results and Discussion 


Table 1 indicates that for Bottle Stimuli 
(singles and pairs) Coca Cola is most accu- 
rately identified. The reader will note that 
Coca Cola was correctly identified 689 times 
compared with Pepsi Cola which was cor- 
rectly identified 203 times and R. C. Cola 
only 59 times. It is also apparent that both
Pepsi and R. C. were misidentified as Coca
Cola more frequently (507 and 680 times
respectively) by contrast with Coca Cola’s
misidentification as Pepsi Cola only 83 times and
R. C. Cola 54 times. R. C. Cola trails far
behind in these respects.

The data for bottle caps indicate an advantage
for Pepsi Cola when presented both
singly and paired with one of the other
brands since the totals of correct identification
for Pepsi are 578, for Coca Cola 355,
and for R. C. Cola 206. It appears that
bottles are favored in the order: Coke, Pepsi,
and R. C.; and caps, Pepsi, Coke and R. C.
These results mean that, with respect to the
competing brands, the Coke bottle and the
Pepsi cap are leaders in each of their categories.
Consistent with this, it is interesting
to note that when the Coke cap is misidentified,
it is misidentified as Pepsi more frequently
(190 times) than Pepsi is misidentified
as Coke ([number illegible] times). For the sum of
singles and pairs, the R. C. cap is more often
misidentified as Pepsi (161) than as
Coke (133).




The data for the slides of the typewritten 
brand names are somewhat different from 
those for bottles and caps. They differ in 
that typed brands presented singly were such 
an easy perceptual task that practically no 
errors appeared. The greater ease of identi- 
fying the typewritten brand names is also 
evident when these materials were shown 
paired in competition with one another. 
However, sufficient errors were made here to 
indicate that under this condition R. C. has 
a greater advantage. For both single and 
paired presentation, R. C. yields 909 correct 
identifications followed by Coca Cola’s 882 
and Pepsi’s 818. These results may indicate 
that our use of the abbreviated form of the 
Royal Crown name gave it a perceptual ad- 
vantage over the other two brands under our 
conditions. 

The tabular data show differential advan- 
tages for the three brands tested depending 
on which stimulus aspect was presented. 
However, when the total of all stimuli are 
considered, then the order of frequency of use 
of the brand names is Coca Cola, Pepsi Cola, 
R. C. Cola. The results here are consistent 
with the findings of our previous studies. 
Though there is some tendency toward inter- 
action based upon the nature of our stimulus 
presentations, in general the hypothesis that 
the brand identifications reflect a “readiness-to-respond”
dependent upon advertising appears
to be supported. The results of
Prothro’s (5) study also support this inter- 
pretation. The results are also suggestive of 
other well known readiness-to-respond reac- 
tions which were related to early and mas- 
sive advertising of such brand name products 
as Victrola and Frigidaire. 
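The orderings discussed above can be reproduced from the correct-identification totals quoted in the text (singles and pairs combined). The sketch below is only an illustration; the dictionary layout and the helper name `rank_brands` are assumptions, and the counts are the nine figures given in the running text rather than the full Table 1.

```python
# Correct identifications, singles and pairs combined, as quoted in the text.
CORRECT = {
    "bottles": {"Coca Cola": 689, "Pepsi Cola": 203, "R. C. Cola": 59},
    "bottle caps": {"Coca Cola": 355, "Pepsi Cola": 578, "R. C. Cola": 206},
    "typed brand names": {"Coca Cola": 882, "Pepsi Cola": 818, "R. C. Cola": 909},
}


def rank_brands(counts):
    """Return brand names ordered from most to fewest identifications."""
    return sorted(counts, key=counts.get, reverse=True)


ORDER_BY_CATEGORY = {category: rank_brands(counts)
                     for category, counts in CORRECT.items()}

# Sum the three categories for each brand.
TOTALS = {brand: sum(CORRECT[category][brand] for category in CORRECT)
          for brand in CORRECT["bottles"]}
OVERALL_ORDER = rank_brands(TOTALS)

for category, order in ORDER_BY_CATEGORY.items():
    print(f"{category}: {', '.join(order)}")
print("all stimuli:", TOTALS, OVERALL_ORDER)
```

The within-category orders differ (Coke leads for bottles, Pepsi for caps, R. C. for typed names), while the summed totals of 1926, 1599, and 1174 fall in the order Coca Cola, Pepsi Cola, R. C. Cola, the same order the text reports when all stimuli are considered.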

The present experiment is related to the 
currently popular area of subliminal percep- 
tion and suggests that lesser known brands 
presented by rapid exposure may readily be 
misidentified as products with more familiar 
brand names. 


Summary 


A group of 210 Ss was asked to identify 45
tachistoscopic slides presented at 1/400 sec.
exposure. The slides contained colored transparencies
of Coca Cola, Pepsi Cola and R. C.
Cola (a) bottles, (b) bottle caps, and (c)
typewritten brand names. All three categories
of stimuli were presented either singly,
in pairs, or as triples, the latter two in
counterbalanced order. None of the triples data
was analyzed for this report. The over-all
results indicate that our Ss showed a greater
accuracy in responding to the brand names
Coca Cola, Pepsi Cola, and R. C. Cola in
that order. They also reveal that when
misidentifications were made, they occurred in
the same sequence. When the categories
were analyzed separately, it was found that
the orders of decreasing frequency for accurate
identifications and misidentifications
were as follows: bottles: Coca Cola, Pepsi
Cola, R. C. Cola; bottle caps: Pepsi Cola,
Coca Cola, R. C. Cola; typewritten brand
name: R. C. Cola, Coca Cola, Pepsi Cola.
These findings were related to previous cola
studies and are believed to support the
hypothesis that identification of cola beverages
is more related to the extent and specific nature
of advertising than to taste, giving certain
brands under certain stimulus conditions
a favored position in regard to a
readiness-to-respond with a particular brand name.


Received August 18, 1958. 
Early Publication. 


References 


1. Bowles, J. W., Jr., & Pronko, N. H. Identification
of cola beverages: II. A further study.
J. appl. Psychol., 1948, 32, 559-564.

2. Pronko, N. H., & Bowles, J. W., Jr. Identification
of cola beverages: I. First study. J.
appl. Psychol., 1948, 32, 304-312.

3. Pronko, N. H., & Bowles, J. W., Jr. Identification
of cola beverages: III. A final study. J.
appl. Psychol., 1949, 33, 605-608.

4. Pronko, N. H., & Herman, D. T. Identification
of cola beverages: IV. Postscript. J. appl.
Psychol., 1950, 34, 68-69.

5. Prothro, E. T. Identification of cola beverages
overseas. J. appl. Psychol., 1953, 37, 494-495.




